Preview image acquisition user interface for linear panoramic image stitching

ABSTRACT

A system and method that allows the capture of a series of images to create a single linear panoramic image is disclosed. The method includes capturing an image, dynamically comparing a previously captured image with a preview image on a display of a capture device until a predetermined overlap threshold is satisfied, generating a user interface to provide feedback on the display of the capture device to guide a movement of the capture device, and capturing the preview image with enough overlap with the previously captured image with little to no tilt for creating a linear panorama.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims priority, under 35 U.S.C. §119, to U.S. Provisional Patent Application No. 62/105,189, filed Jan. 19, 2015 entitled “Image Acquisition User Interface for Linear Panoramic Image Stitching,” and to U.S. Provisional Patent Application No. 62/127,750, filed Mar. 3, 2015 entitled “Image Acquisition User Interface for Linear Panoramic Image Stitching,” which are incorporated by reference in their entirety.

BACKGROUND

Field of the Invention

The specification generally relates to providing a user interface for guiding the user to capture a series of images to create a single linear panoramic image. In particular, the specification relates to a system and method for generating one or more user interface elements that provide instantaneous feedback to guide the user in capturing the series of images to create the single linear panoramic image.

Description of the Background Art

A planogram is a visual representation of products in a retail environment. For example, a planogram may describe where in the retail environment and in what quantity products should be located. Such planograms are known to be effective tools for increasing sales, managing inventory and otherwise ensuring that the desired quantity and sizes of an item are placed to optimize profits or other parameters. However, presentation and maintenance of adequate levels of stock on shelves, racks and display stands is a labor-intensive effort, thereby making enforcement of planograms difficult. While the location and quantity of products in retail stores can be manually tracked by a user, attempts are being made to automatically recognize the products and automatically or semi-automatically obtain information about the state of products.

Previous attempts at recognizing products have deficiencies. For example, one method to achieve the goal of recognizing multiple products from multiple images is through image stitching. Unfortunately, existing image stitching techniques can lead to artifacts and can interfere with the optimal operation of recognition.

SUMMARY

The techniques introduced herein overcome the deficiencies and limitations of the prior art, at least in part, with a system and method for capturing a series of images to create a linear panorama. In one embodiment, the system includes an image recognition application. The image recognition application is configured to receive an image of a portion of an object of interest from a capture device and to determine the features of the image. The image recognition application is further configured to generate a user interface including a current preview image of the object of interest on a display of the capture device and to compare dynamically the features of the image with the current preview image of the object of interest on the display of the capture device to determine overlap. The image recognition application is further configured to update the user interface to include a first visually distinct indicator to guide a movement of the capture device to produce the overlap and to determine whether the overlap between the image and the current preview image satisfies a predetermined overlap threshold. The image recognition application is further configured to capture a next image of the portion of the object of interest using the capture device based on the overlap satisfying the predetermined overlap threshold.

Other aspects include corresponding methods, systems, apparatuses, and computer program products for these and other innovative aspects.

The features and advantages described herein are not all-inclusive and many additional features and advantages will be apparent to one of ordinary skill in the art in view of the figures and description. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and not to limit the scope of the techniques described.

BRIEF DESCRIPTION OF THE DRAWINGS

The techniques introduced herein are illustrated by way of example, and not by way of limitation in the figures of the accompanying drawings in which like reference numerals are used to refer to similar elements.

FIG. 1 is a high-level block diagram illustrating one embodiment of a system for capturing a series of images to create a linear panorama.

FIG. 2 is a block diagram illustrating one embodiment of a computing device including an image recognition application.

FIG. 3 is a flow diagram illustrating one embodiment of a method for capturing a series of images of an object of interest under a guidance of direction for a single linear panoramic image.

FIGS. 4A-4B are flow diagrams illustrating one embodiment of a method for capturing a series of images of an object of interest in a directionally guided pattern for generating a single linear panoramic image.

FIGS. 5A-5B are flow diagrams illustrating another embodiment of a method for capturing a series of images of an object of interest in a directionally guided pattern for generating a single linear panoramic image.

FIGS. 6A-6B are flow diagrams illustrating one embodiment of a method for realigning the current preview image with a previously captured image of an object of interest.

FIG. 7A is a graphical representation of an embodiment of a user interface for capturing an image of a shelf.

FIG. 7B is a graphical representation of another embodiment of the user interface for capturing an image of a shelf.

FIG. 8 is a graphical representation of one embodiment of an overlap between images captured of an object of interest.

FIG. 9 is a graphical representation of one embodiment of the image matching process for generating the visually distinct indicator for overlap.

FIGS. 10A-10D are graphical representations of embodiments of the user interface displaying a visually distinct indicator for overlap when the client device moves in a left-to-right direction.

FIGS. 11A-11D are graphical representations of embodiments of the user interface displaying a visually distinct indicator for overlap when the client device moves in a bottom-to-top direction.

FIGS. 12A-12C are graphical representations of embodiments of the user interface displaying a visually distinct indicator for tilt when the client device is rolling about the Z axis.

FIGS. 13A-13C are graphical representations of embodiments of the user interface displaying a visually distinct indicator for tilt when the client device is pitching about the X axis.

FIGS. 14A-14B are graphical representations of embodiments of the user interface displaying a visually distinct indicator for tilt when the client device is tilting in both X and Z axes.

FIG. 15 is a graphical representation of one embodiment of the realignment process for generating the visually distinct indicator for realignment.

FIGS. 16A-16D are graphical representations of embodiments of the user interface displaying realigning current preview image displayed on a client device with a previously captured image.

FIGS. 17A-17F are graphical representations illustrating another set of embodiments of the user interface displaying realigning current preview image displayed on a client device with a previously captured image.

FIG. 18 is a graphical representation of one embodiment of the serpentine scan pattern of image capture.

FIG. 19 is a graphical representation of one embodiment of constructing a mosaic preview using images of a shelving unit.

FIGS. 20A-20I are graphical representations of embodiments of the user interface displaying a visually distinct indicator for direction of movement of the client device.

FIG. 21 is a graphical representation of another embodiment of the user interface displaying a visually distinct indicator for direction of movement of the client device.

FIGS. 22A-22B are graphical representations of embodiments of the user interface previewing the set of captured images in a mosaic.

DETAILED DESCRIPTION

FIG. 1 is a high-level block diagram illustrating one embodiment of a system 100 for capturing a series of images to create a linear panorama. The illustrated system 100 may have one or more client devices 115 a . . . 115 n that can be accessed by users and a recognition server 101. In FIG. 1 and the remaining figures, a letter after a reference number, e.g., “115 a,” represents a reference to the element having that particular reference number. A reference number in the text without a following letter, e.g., “115,” represents a general reference to instances of the element bearing that reference number. In the illustrated embodiment, these entities of the system 100 are communicatively coupled via a network 105.

The network 105 can be a conventional type, wired or wireless, and may have numerous different configurations including a star configuration, token ring configuration or other configurations. Furthermore, the network 105 may include a local area network (LAN), a wide area network (WAN) (e.g., the Internet), and/or other interconnected data paths across which multiple devices may communicate. In some embodiments, the network 105 may be a peer-to-peer network. The network 105 may also be coupled to or include portions of a telecommunications network for sending data in a variety of different communication protocols. In some embodiments, the network 105 may include Bluetooth communication networks or a cellular communications network for sending and receiving data including via short messaging service (SMS), multimedia messaging service (MMS), hypertext transfer protocol (HTTP), direct data connection, WAP, email, etc. Although FIG. 1 illustrates one network 105 coupled to the client devices 115 and the recognition server 101, in practice one or more networks 105 can be connected to these entities.

In some embodiments, the system 100 includes a recognition server 101 coupled to the network 105. In some embodiments, the recognition server 101 may be either a hardware server, a software server, or a combination of software and hardware. The recognition server 101 may be, or may be implemented by, a computing device including a processor, a memory, applications, a database, and network communication capabilities. In the example of FIG. 1, the components of the recognition server 101 are configured to implement an image recognition application 103 a described in more detail below. In one embodiment, the recognition server 101 provides services to a consumer packaged goods firm for identifying products on shelves, racks, or displays. While the examples herein describe recognition of products in an image of shelves, such as a retail display, it should be understood that the image may include any arrangement of organized objects. For example, the image may be of a warehouse, stockroom, store room, cabinet, etc. Similarly, the objects, in addition to retail products, may be tools, parts used in manufacturing, construction or maintenance, medicines, first aid supplies, emergency or safety equipment, etc.

In some embodiments, the recognition server 101 sends and receives data to and from other entities of the system 100 via the network 105. For example, the recognition server 101 sends and receives data including images to and from the client device 115. The images received by the recognition server 101 can include an image captured by the client device 115, an image copied from a website or an email, or an image from any other source. Although only a single recognition server 101 is shown in FIG. 1, it should be understood that there may be any number of recognition servers 101 or a server cluster. The recognition server 101 also includes a data storage 243, which is described below in more detail with reference to FIG. 2.

The client device 115 may be a computing device that includes a memory, a processor and a camera, for example a laptop computer, a desktop computer, a tablet computer, a mobile telephone, a smartphone, a personal digital assistant (PDA), a mobile email device, a webcam, a user wearable computing device or any other electronic device capable of accessing a network 105. The client device 115 provides general graphics and multimedia processing for any type of application. For example, the client device 115 may include a graphics processor unit (GPU) for handling graphics and multimedia processing. The client device 115 includes a display for viewing information provided by the recognition server 101. While FIG. 1 illustrates two client devices 115 a and 115 n, the disclosure applies to a system architecture having one or more client devices 115.

The client device 115 is adapted to send and receive data to and from the recognition server 101. For example, the client device 115 sends a query image to the recognition server 101 and the recognition server 101 provides data in JavaScript Object Notation (JSON) format about one or more objects recognized in the query image to the client device 115. The client device 115 may support use of graphical application program interface (API) such as Metal on Apple iOS™ or RenderScript on Android™ for determination of feature location and feature descriptors on the client device 115.
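By way of illustration only, the following sketch shows how a client might read recognition results delivered in JSON format. The specification does not define a schema, so the field names `objects`, `label` and `bbox`, the sample values, and the function name `parse_recognition_results` are all hypothetical.

```python
import json

# Hypothetical response body. The field names and values are illustrative
# assumptions, not a schema given in the specification.
response_body = """
{"objects": [
  {"label": "cereal box", "bbox": [12, 40, 96, 180]},
  {"label": "soup can", "bbox": [130, 60, 50, 90]}
]}
"""

def parse_recognition_results(body):
    """Return (label, bounding box) pairs from a JSON recognition response."""
    data = json.loads(body)
    return [(obj["label"], tuple(obj["bbox"])) for obj in data.get("objects", [])]

results = parse_recognition_results(response_body)
```

A response with no recognized objects yields an empty list under this assumed layout.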

The image recognition application 103 may include software and/or logic to provide the functionality for capturing a series of images to create a linear panorama. In some embodiments, the image recognition application 103 can be implemented using programmable or specialized hardware, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the image recognition application 103 can be implemented using a combination of hardware and software. In other embodiments, the image recognition application 103 may be stored and executed on a combination of the client devices 115 and the recognition server 101, or by any one of the client devices 115 or recognition server 101.

In some embodiments, the image recognition application 103 b may be a thin-client application with some functionality executed on the client device 115 and additional functionality executed on the recognition server 101 by image recognition application 103 a. For example, the image recognition application 103 b on the client device 115 could include software and/or logic for capturing the image, transmitting the image to the recognition server 101, and displaying image recognition results. In another example, the image recognition application 103 a on the recognition server 101 could include software and/or logic for receiving the image, stitching the image to a mosaic view based on sufficient overlap with a previously received image and generating image recognition results. The image recognition application 103 a or 103 b may include further functionality described herein, such as processing the image and performing feature identification.

In some embodiments, the image recognition application 103 receives an image of a portion of an object of interest from a capture device. The image recognition application 103 determines features of the image. The image recognition application 103 generates a user interface including a current preview image of the object of interest on a display of the capture device. The image recognition application 103 dynamically compares the features of the image with the current preview image of the object of interest to determine overlap. The image recognition application 103 updates the user interface to include a visually distinct indicator to guide a movement of the capture device to produce the desired or prescribed overlap and alignment between the images. The image recognition application 103 determines whether the overlap between the image and the current preview image satisfies predetermined overlap and alignment thresholds. For example, an overlap threshold can be set at 60 percent between images to be stitched together to create a linear panorama. The image recognition application 103 captures the preview image of the portion of the object of interest based on the overlap satisfying the predetermined overlap threshold. The operation of the image recognition application 103 and the functions listed above are described in more detail below with reference to FIGS. 3-15.
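By way of illustration only, the 60 percent example can be worked through with a simplified geometric model. The sketch below assumes that the previously captured image and the preview image have the same size and differ only by a translation of (dx, dy) pixels, and it treats the threshold as satisfied when the overlap lies within a tolerance of the target. The translation-only model, the tolerance value and the function names are assumptions made for this sketch and are not taken from the specification.

```python
TARGET_OVERLAP = 0.60  # the 60 percent example from the text
TOLERANCE = 0.05       # assumed tolerance band, not specified in the text

def overlap_fraction(width, height, dx, dy):
    """Fraction of the frame area shared by two equally sized frames
    offset from each other by (dx, dy) pixels."""
    overlap_w = max(0, width - abs(dx))
    overlap_h = max(0, height - abs(dy))
    return (overlap_w * overlap_h) / (width * height)

def satisfies_threshold(overlap, target=TARGET_OVERLAP, tolerance=TOLERANCE):
    """True when the overlap is within the tolerance of the target."""
    return abs(overlap - target) <= tolerance

# A 1000x800 preview shifted 400 pixels sideways shares 600x800 pixels
# with the previous image, i.e. 60 percent of the frame.
example = overlap_fraction(1000, 800, 400, 0)
```

Under this model a small shift of 100 pixels leaves 90 percent overlap, which would not yet trigger a capture, while a shift larger than the frame width leaves no overlap at all.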

FIG. 2 is a block diagram illustrating one embodiment of a computing device 200 including an image recognition application 103. The computing device 200 may also include a processor 235, a memory 237, an optional display device 239, a communication unit 241, data storage 243, optional orientation sensors 245 and an optional capture device 247 according to some examples. The components of the computing device 200 are communicatively coupled by a bus 220. The bus 220 may represent one or more buses including an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, a universal serial bus (USB), or some other bus known in the art to provide similar functionality. In some embodiments, the computing device 200 may be the client device 115, the recognition server 101, or a combination of the client device 115 and the recognition server 101. In such embodiments where the computing device 200 is the client device 115 or the recognition server 101, it should be understood that the client device 115 and the recognition server 101 may include other components described above but not shown in FIG. 2.

The processor 235 may execute software instructions by performing various input/output, logical, and/or mathematical operations. The processor 235 may have various computing architectures to process data signals including, for example, a complex instruction set computer (CISC) architecture, a reduced instruction set computer (RISC) architecture, and/or an architecture implementing a combination of instruction sets. The processor 235 may be physical and/or virtual, and may include a single processing unit or a plurality of processing units and/or cores. In some implementations, the processor 235 may be capable of generating and providing electronic display signals to a display device, supporting the display of images, capturing and transmitting images, performing complex tasks including various types of feature extraction and sampling, etc. In some implementations, the processor 235 may be coupled to the memory 237 via the bus 220 to access data and instructions therefrom and store data therein. The bus 220 may couple the processor 235 to the other components of the computing device 200 including, for example, the memory 237, the communication unit 241, the image recognition application 103, and the data storage 243. It will be apparent to one skilled in the art that other processors, operating systems, sensors, displays and physical configurations are possible.

The memory 237 may store and provide access to data for the other components of the computing device 200. The memory 237 may be included in a single computing device or distributed among a plurality of computing devices as discussed elsewhere herein. In some implementations, the memory 237 may store instructions and/or data that may be executed by the processor 235. The instructions and/or data may include code for performing the techniques described herein. For example, in one embodiment, the memory 237 may store the image recognition application 103. The memory 237 is also capable of storing other instructions and data, including, for example, an operating system, hardware drivers, other software applications, databases, etc. The memory 237 may be coupled to the bus 220 for communication with the processor 235 and the other components of the computing device 200.

The memory 237 may include one or more non-transitory computer-usable (e.g., readable, writeable) mediums, which can be any tangible apparatus or device that can contain, store, communicate, or transport instructions, data, computer programs, software, code, routines, etc., for processing by or in connection with the processor 235. In some implementations, the memory 237 may include one or more of volatile memory and non-volatile memory. For example, the memory 237 may include, but is not limited to, one or more of a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, an embedded memory device, a discrete memory device (e.g., a PROM, FPROM, ROM), a hard disk drive, an optical disk drive (CD, DVD, Blu-Ray™, etc.). It should be understood that the memory 237 may be a single device or may include multiple types of devices and configurations.

The display device 239 is a liquid crystal display (LCD), light emitting diode (LED) or any other similarly equipped display device, screen or monitor. The display device 239 represents any device equipped to display user interfaces, electronic images and data as described herein. In different embodiments, the display is binary (only two different values for pixels), monochrome (multiple shades of one color), or allows multiple colors and shades. The display device 239 is coupled to the bus 220 for communication with the processor 235 and the other components of the computing device 200. It should be noted that the display device 239 is shown in FIG. 2 with dashed lines to indicate it is optional. For example, where the computing device 200 is the recognition server 101, the display device 239 is not part of the system; where the computing device 200 is the client device 115, the display device 239 is included and is used to display the user interfaces described below with reference to FIGS. 7A, 7B, 9A-15B, 17A-171, 18 and 22A-22F.

The communication unit 241 is hardware for receiving and transmitting data by linking the processor 235 to the network 105 and other processing systems. The communication unit 241 receives data such as requests from the client device 115 and transmits the requests to the controller 201, for example a request to process an image. The communication unit 241 also transmits information including recognition results to the client device 115 for display, for example, in response to processing the image. The communication unit 241 is coupled to the bus 220. In one embodiment, the communication unit 241 may include a port for direct physical connection to the client device 115 or to another communication channel. For example, the communication unit 241 may include an RJ45 port or similar port for wired communication with the client device 115. In another embodiment, the communication unit 241 may include a wireless transceiver (not shown) for exchanging data with the client device 115 or any other communication channel using one or more wireless communication methods, such as IEEE 802.11, IEEE 802.16, Bluetooth® or another suitable wireless communication method.

In yet another embodiment, the communication unit 241 may include a cellular communications transceiver for sending and receiving data over a cellular communications network such as via short messaging service (SMS), multimedia messaging service (MMS), hypertext transfer protocol (HTTP), direct data connection, WAP, e-mail or another suitable type of electronic communication. In still another embodiment, the communication unit 241 may include a wired port and a wireless transceiver. The communication unit 241 also provides other conventional connections to the network 105 for distribution of files and/or media objects using standard network protocols such as TCP/IP, HTTP, HTTPS and SMTP as will be understood by those skilled in the art.

The data storage 243 is a non-transitory memory that stores data for providing the functionality described herein. The data storage 243 may be a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, flash memory or some other memory devices. In some embodiments, the data storage 243 also may include a non-volatile memory or similar permanent storage device and media including a hard disk drive, a floppy disk drive, a CD-ROM device, a DVD-ROM device, a DVD-RAM device, a DVD-RW device, a flash memory device, or some other mass storage device for storing information on a more permanent basis.

In the illustrated embodiment, the data storage 243 is communicatively coupled to the bus 220. The data storage 243 stores data for analyzing a received image and results of the analysis and other functionality as described herein. For example, the data storage 243 may store an image overlap threshold for capturing optimal overlapping images. The data storage 243 may similarly store a captured image and the set of features determined for the captured image. Additionally, the data storage 243 may store a stitched linear panoramic image. The data stored in the data storage 243 is described below in more detail.

The orientation sensors 245 may be hardware-based or software-based, or a combination of hardware and software for determining position or motion of the computing device 200. In some embodiments, the orientation sensors 245 may include an accelerometer, a gyroscope, a proximity sensor, a geomagnetic field sensor, etc. In different embodiments, the orientation sensors 245 may provide acceleration force data for the three coordinate axes, rate of rotation data for the three coordinate axes (e.g., yaw, pitch and roll values), proximity data indicating a distance of an object, etc. It should be noted that the orientation sensors 245 are shown in FIG. 2 with dashed lines to indicate they are optional. For example, where the computing device 200 is the recognition server 101, the orientation sensors 245 are not part of the system; where the computing device 200 is the client device 115, the orientation sensors 245 are included and are used to provide sensor information for various motion or position determination events of the client device 115 described herein.
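By way of illustration only, acceleration force data for the three coordinate axes can be converted into tilt angles when the device is held approximately still, because the accelerometer then measures mostly gravity. The sketch below assumes a device coordinate system in which X points to the right of the display, Y points to the top of the display and Z points out of the display, so that an upright device reads roughly (0, 9.81, 0). The axis convention, the 2 degree tolerance and the function names are assumptions made for this sketch; the specification does not prescribe a formula.

```python
import math

def tilt_angles(ax, ay, az):
    """Return (roll about Z, pitch about X) in degrees from a gravity reading.

    Assumes X right, Y up and Z out of the display, with the device upright
    when the reading is approximately (0, g, 0).
    """
    roll_z = math.degrees(math.atan2(ax, ay))
    pitch_x = math.degrees(math.atan2(az, math.hypot(ax, ay)))
    return roll_z, pitch_x

def is_level(ax, ay, az, tolerance_deg=2.0):
    """True when both tilt angles are within the assumed tolerance."""
    roll_z, pitch_x = tilt_angles(ax, ay, az)
    return abs(roll_z) <= tolerance_deg and abs(pitch_x) <= tolerance_deg
```

A user interface could, for example, show a tilt indicator whenever `is_level` returns False. Readings taken while the device is accelerating would need filtering or gyroscope data, which this sketch omits.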

The capture device 247 may be operable to capture an image or data digitally of an object of interest. For example, the capture device 247 may be a high definition (HD) camera, a regular 2D camera, a multi-spectral camera, a structured light 3D camera, a time-of-flight 3D camera, a stereo camera, a standard smartphone camera or a wearable computing device. The capture device 247 is coupled to the bus to provide the images and other processed metadata to the processor 235, the memory 237 or the data storage 243. It should be noted that the capture device 247 is shown in FIG. 2 with dashed lines to indicate it is optional. For example, where the computing device 200 is the recognition server 101, the capture device 247 is not part of the system; where the computing device 200 is the client device 115, the capture device 247 is included and is used to provide images and other metadata information described below with reference to FIGS. 7A, 7B, 9A-15B, 17A-171, 18 and 22A-22F.

In some embodiments, the image recognition application 103 may include a controller 201, a feature extraction module 203, an alignment module 205, a user guidance module 207, a stitching module 209 and a user interface module 211. The components of the image recognition application 103 are communicatively coupled via the bus 220.

The controller 201 may include software and/or logic to control the operation of the other components of the image recognition application 103. The controller 201 controls the other components of the image recognition application 103 to perform the methods described below with reference to FIGS. 3-6. The controller 201 may also include software and/or logic to provide the functionality for handling communications between the image recognition application 103 and other components of the computing device 200 as well as between the components of the image recognition application 103. In some embodiments, the controller 201 can be implemented using programmable or specialized hardware including a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the controller 201 can be implemented using a combination of hardware and software executable by processor 235. In some embodiments, the controller 201 is a set of instructions executable by the processor 235. In some implementations, the controller 201 is stored in the memory 237 and is accessible and executable by the processor 235. In some implementations, the controller 201 is adapted for cooperation and communication with the processor 235, the memory 237 and other components of the image recognition application 103 via the bus 220.

In some embodiments, the controller 201 sends and receives data, via the communication unit 241, to and from one or more of the client device 115 and the recognition server 101. For example, the controller 201 receives, via the communication unit 241, an image from a client device 115 operated by a user and sends the image to the feature extraction module 203. In another example, the controller 201 receives data for providing a graphical user interface to a user from the user interface module 211 and sends the data to a client device 115, causing the client device 115 to present the user interface to the user.

In some embodiments, the controller 201 receives data from other components of the image recognition application 103 and stores the data in the data storage 243. For example, the controller 201 receives data including features identified for an image from the feature extraction module 203 and stores the data in the data storage 243. In other embodiments, the controller 201 retrieves data from the data storage 243 and sends the data to other components of the image recognition application 103. For example, the controller 201 retrieves data including an overlap threshold from the data storage 243 and sends the retrieved data to the alignment module 205.

The feature extraction module 203 may include software and/or logic to provide the functionality for receiving an image of an object of interest from the client device 115 and determining features for the image. In some embodiments, the feature extraction module 203 can be implemented using programmable or specialized hardware including a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the feature extraction module 203 can be implemented using a combination of hardware and software executable by processor 235. In some embodiments, the feature extraction module 203 is a set of instructions executable by the processor 235. In some implementations, the feature extraction module 203 is stored in the memory 237 and is accessible and executable by the processor 235. In some implementations, the feature extraction module 203 is adapted for cooperation and communication with the processor 235, the memory 237 and other components of the image recognition application 103 via the bus 220.

In some embodiments, the feature extraction module 203 receives an image and determines features for the image. In some embodiments, the feature extraction module 203 receives a preview image of an object of interest from the alignment module 205 and determines a set of features for the image. For example, the feature extraction module 203 may determine a location, an orientation, and an image descriptor for each feature identified in the image. In some embodiments, the feature extraction module 203 uses corner detection algorithms such as the Shi-Tomasi corner detection algorithm, the Harris and Stephens corner detection algorithm, etc., for determining feature location. In some embodiments, the feature extraction module 203 uses the Binary Robust Independent Elementary Features (BRIEF) descriptor approach for determining efficient image feature descriptors. In some embodiments, the feature extraction module 203 sends the set of features for the images to the alignment module 205. In other embodiments, the feature extraction module 203 identifies the image as a reference image and stores the set of features in the data storage 243.
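By way of illustration only, the idea behind a BRIEF-style descriptor can be sketched as a series of pairwise intensity comparisons around a feature location, with each comparison contributing one bit. The sketch below is a simplified stand-in: it omits the patch smoothing normally applied before the comparisons, uses a small 32-bit descriptor, and assumes the feature location has already been found by a corner detector. The function names, the patch radius and the sampling scheme are assumptions made for this sketch.

```python
import random

def make_test_pairs(n_bits=32, patch_radius=3, seed=0):
    """Fixed pseudo-random pairs of pixel offsets, one pair per descriptor bit."""
    rng = random.Random(seed)
    r = patch_radius

    def offset():
        return (rng.randint(-r, r), rng.randint(-r, r))

    return [(offset(), offset()) for _ in range(n_bits)]

def brief_descriptor(image, x, y, pairs):
    """Pack the results of the intensity comparisons around (x, y) into an int.

    `image` is a list of rows of grayscale values; (x, y) must lie at least
    the patch radius away from the image border.
    """
    bits = 0
    for i, ((dx1, dy1), (dx2, dy2)) in enumerate(pairs):
        if image[y + dy1][x + dx1] < image[y + dy2][x + dx2]:
            bits |= 1 << i
    return bits

# A small synthetic gradient image and its uniformly brightened copy.
image = [[10 * col + row for col in range(9)] for row in range(9)]
brighter = [[value + 50 for value in row] for row in image]
pairs = make_test_pairs()
descriptor = brief_descriptor(image, 4, 4, pairs)
```

Because each bit depends only on which of two pixels is darker, the brightened copy yields the same descriptor, which is one reason binary descriptors of this kind are inexpensive to compute and compare on a client device.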

The alignment module 205 may include software and/or logic to provide the functionality for receiving a preview image of an object of interest from the client device 115 for realignment with a reference image, instructing the user interface module 211 to generate a user interface including the preview image and/or dynamically comparing features of the reference image and a preview image of an object of interest. In some embodiments, the alignment module 205 can be implemented using programmable or specialized hardware including a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the alignment module 205 can be implemented using a combination of hardware and software executable by the processor 235. In some embodiments, the alignment module 205 is a set of instructions executable by the processor 235. In some implementations, the alignment module 205 is stored in the memory 237 and is accessible and executable by the processor 235. In some implementations, the alignment module 205 is adapted for cooperation and communication with the processor 235, the memory 237 and other components of the image recognition application 103 via the bus 220.

In some embodiments, the alignment module 205 continuously receives preview images of an object of interest sampled by the capture device 247 and sends the preview images to the feature extraction module 203. In other embodiments, the alignment module 205 instructs the user interface module 211 to generate a user interface for displaying the preview image on a display of the client device 115. In some embodiments, the alignment module 205 may receive a user selection for realignment of images on the client device 115. In some embodiments, the alignment module 205 receives features for the preview images from the feature extraction module 203 and dynamically compares the features of the reference image against the features of the preview images. In some embodiments, the alignment module 205 determines an overlap between images and instructs the user interface module 211 to generate visually distinct indicators on a user interface for guiding a movement of the client device 115 to produce a desired overlap. In other embodiments, the alignment module 205 determines whether the overlap satisfies a predetermined overlap threshold and sends instructions to the feature extraction module 203 to set the preview image as the reference image based on the predetermined overlap threshold being satisfied.

The user guidance module 207 may include software and/or logic to provide the functionality for guiding a movement of the client device 115 in a direction, guiding an orientation of the client device 115 in an axis of orientation and providing progress information through visually distinct indicators. In some embodiments, the user guidance module 207 can be implemented using programmable or specialized hardware including a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the user guidance module 207 can be implemented using a combination of hardware and software executable by the processor 235. In some embodiments, the user guidance module 207 is a set of instructions executable by the processor 235. In some implementations, the user guidance module 207 is stored in the memory 237 and is accessible and executable by the processor 235. In some implementations, the user guidance module 207 is adapted for cooperation and communication with the processor 235, the memory 237 and other components of the image recognition application 103 via the bus 220.

In some embodiments, the user guidance module 207 receives gyroscope sensor information from the orientation sensors 245 of the client device 115. In some embodiments, the user guidance module 207 determines whether the client device 115 is tilting in one of the three axes of orientation based on the gyroscope sensor information. In other embodiments, the user guidance module 207 sends instructions to the user interface module 211 for generating visually distinct indicators on a user interface for guiding an orientation of the client device 115 to nullify the tilt. In some embodiments, the user guidance module 207 receives a selection of a pattern of image capture for receiving images of an object of interest from a client device 115. In some embodiments, the user guidance module 207 sends instructions to the user interface module 211 for generating visually distinct indicators for directional movement of the client device based on the selected pattern of image capture. In other embodiments, the user guidance module 207 sends instructions to the user interface module 211 for generating a mosaic preview of images received for an object of interest on the user interface.
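The tilt determination described above can be illustrated with a minimal sketch. It assumes the gyroscope information has already been reduced to one angle in degrees per axis; the function name `tilted_axes`, the dictionary layout and the 2-degree tolerance are hypothetical and do not come from the specification.

```python
from typing import Dict, List


def tilted_axes(angles_deg: Dict[str, float], tolerance_deg: float = 2.0) -> List[str]:
    """Return the axes (among x, y, z) whose tilt exceeds the tolerance.

    angles_deg maps an axis name to its deviation from the level
    orientation, in degrees. Both the representation and the default
    tolerance are assumptions made for illustration.
    """
    return [
        axis
        for axis in ("x", "y", "z")
        if abs(angles_deg.get(axis, 0.0)) > tolerance_deg
    ]
```

A user interface could then show one visually distinct indicator per returned axis, and none when the list is empty.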

The stitching module 209 may include software and/or logic to provide the functionality for stitching a series of images into a single linear panoramic image. In some embodiments, the stitching module 209 can be implemented using programmable or specialized hardware including a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the stitching module 209 can be implemented using a combination of hardware and software executable by the processor 235. In some embodiments, the stitching module 209 is a set of instructions executable by the processor 235. In some implementations, the stitching module 209 is stored in the memory 237 and is accessible and executable by the processor 235. In some implementations, the stitching module 209 is adapted for cooperation and communication with the processor 235, the memory 237 and other components of the image recognition application 103 via the bus 220.

In some embodiments, the stitching module 209 receives the reference images of the object of interest from the feature extraction module 203. In some embodiments, the stitching module 209 receives overlap information between the images being processed by the alignment module 205. In some embodiments, where the computing device 200 is the client device 115, the stitching module 209 of the image recognition application 103 sends the reference images of the object of interest, overlap information and other metadata information to the recognition server 101 for generating a single linear panoramic image. In some embodiments, where the computing device 200 is the recognition server 101, the stitching module 209 of the image recognition application 103 generates the single linear panoramic image using the reference images of the object of interest, overlap information and other metadata information. In other embodiments, the stitching module 209 receives the linear panoramic image, stores the linear panoramic image in the data storage 243 and instructs the user interface module 211 to generate a user interface for displaying the linear panoramic image.

The user interface module 211 may include software and/or logic for providing user interfaces to a user. In some embodiments, the user interface module 211 can be implemented using programmable or specialized hardware including a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In some embodiments, the user interface module 211 can be implemented using a combination of hardware and software executable by the processor 235. In some embodiments, the user interface module 211 is a set of instructions executable by the processor 235. In some implementations, the user interface module 211 is stored in the memory 237 and is accessible and executable by the processor 235. In some implementations, the user interface module 211 is adapted for cooperation and communication with the processor 235, the memory 237 and other components of the image recognition application 103 via the bus 220.

In some embodiments, the user interface module 211 receives instructions from the alignment module 205 to generate a graphical user interface that instructs the user on how to move the client device 115 to capture a next image that has a good overlap with the previously captured image. In some embodiments, the user interface module 211 receives instructions from the user guidance module 207 to generate a graphical user interface that guides the user to capture an overlapping image with little to no tilt in any of the axes of orientation (e.g., X, Y, or Z axis). In other embodiments, the user interface module 211 sends graphical user interface data to an application (e.g., a browser) in the client device 115 via the communication unit 241, causing the application to display the data as a graphical user interface.

Methods

FIG. 3 is a flow diagram illustrating one embodiment of a method 300 for capturing a series of images of an object of interest under guidance of direction for a single linear panoramic image. At 302, the feature extraction module 203 receives an image of a portion of an object of interest from a client device 115 for serving as a reference image. For example, the image can be an image of a shelf, a region, an artwork, a landmark, a scenic location, outer space, etc. The image is processed and, assuming it satisfies the criteria (location, orientation and alignment) for being the first image in the series of images needed to form the single linear panoramic image, it is identified as the reference image. At 304, the alignment module 205 determines whether there are preview images being sampled by the client device 115. If the preview images are being sampled, at 306, the alignment module 205 receives a preview image of another portion of the object of interest from the client device 115. At 308, the user interface module 211 generates a user interface including a visually distinct indicator overlaid upon the preview image, the visually distinct indicator identifying a direction for guiding a movement of the client device. For example, the direction of movement can be in a north, south, east, or west direction. At 310, the user interface module 211 adds to the user interface a progressively growing mosaic preview, the mosaic preview including a thumbnail representation of the reference image. At 312, the alignment module 205 dynamically compares the reference image with the preview image to determine whether an overlap detected between the reference image and the preview image satisfies a predetermined overlap threshold. For example, the predetermined overlap threshold can be set at 60 percent. At 314, the alignment module 205 checks whether the overlap threshold is satisfied. If the overlap threshold is satisfied, at 316, the feature extraction module 203 sets the preview image to be the reference image and the method 300 repeats the process from step 304. If the overlap threshold is not satisfied, the method 300 repeats the process from step 304. More images are received as preview images on the display of the capture device and the user interface is continuously updated until a preview image with sufficient overlap with the reference image is determined. If the preview images are not being sampled by the client device 115, then at 318, the stitching module 209 sends the images of the portion of the object of interest to generate a single linear panoramic image. In some embodiments, the alignment module 205 is responsive to user input and once the user stops providing preview images, the alignment module 205 sends an instruction to the stitching module 209. The stitching module 209 sends the images of the object of interest to the recognition server 101 for the panoramic image to be generated. In some embodiments, the stitching module 209 provides feedback to the user as to whether enough images have been captured to form a panoramic image. In some embodiments, the user guidance module 207 may receive input as to the pattern of image capture and the user guidance module 207 may instruct the user interface module 211 to generate a user interface to guide the user as to the next image to preview or provide. In other words, the method may provide the user additional feedback as to what vertical and lateral movement to make to provide previews of images.
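The control flow of the method 300 can be summarized in a short sketch. It is an illustration under stated assumptions rather than the claimed implementation: the function `collect_reference_images` and its parameters are hypothetical, the overlap measure and the test for "threshold satisfied" are supplied by the caller because the exact comparison is left to the alignment module 205, and the user interface steps 308 and 310 are omitted.

```python
from typing import Callable, Iterable, List, TypeVar

Image = TypeVar("Image")


def collect_reference_images(
    first_image: Image,
    previews: Iterable[Image],
    overlap_fn: Callable[[Image, Image], float],
    is_satisfied: Callable[[float], bool],
) -> List[Image]:
    """Walk a stream of preview images and keep those promoted to reference."""
    references = [first_image]   # step 302: first image serves as the reference
    reference = first_image
    for preview in previews:     # steps 304/306: loop while previews are sampled
        overlap = overlap_fn(reference, preview)   # step 312: dynamic comparison
        if is_satisfied(overlap):                  # step 314: threshold check
            reference = preview                    # step 316: preview becomes reference
            references.append(preview)
    return references            # step 318: images handed on for stitching
```

In a toy usage, images could be modeled as horizontal offsets of a 100-unit-wide frame, with `overlap_fn=lambda a, b: max(0, 100 - abs(a - b))` and `is_satisfied=lambda pct: pct <= 60`, so that a preview is promoted once the device has moved far enough for the overlap to drop to 60 percent.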

FIGS. 4A-4B are flow diagrams illustrating one embodiment of a method 400 for capturing a series of images of an object of interest in a directionally guided pattern for generating a single linear panoramic image. At 402, the user guidance module 207 receives a selection of a serpentine pattern of image capture for receiving images of an object of interest from a client device. At 404, the feature extraction module 203 receives an image of a portion of the object of interest from the client device and identifies the image as a reference image. At 406, the feature extraction module 203 determines features of the reference image. For example, the feature extraction module 203 determines an image descriptor for each feature identified for the reference image. The feature extraction module 203 uses the Binary Robust Independent Elementary Features (BRIEF) descriptor approach for determining efficient image feature descriptors. The image descriptor can be a 256-bit bitmask which describes the image sub-region covered by the feature.

At 408, the user guidance module 207 checks whether a lateral direction for the serpentine pattern is known. If the lateral direction for the serpentine pattern is known, at 414, the alignment module 205 determines whether there are preview images being sampled by the client device 115. For example, the current preview image can be the live preview generated on a display screen of the client device 115 by continuously receiving the image formed on the lens and processed by the image sensor included within the client device 115. If the preview images are being sampled, at 416, the alignment module 205 receives a preview image of another portion of the object of interest from the client device 115. At 418, the user interface module 211 generates a user interface including the preview image on a display of the client device. At 420, the user interface module 211 adds to the user interface a visually distinct indicator identifying a direction for guiding a movement of the client device for receiving additional preview images from the client device 115 according to the serpentine pattern. For example, the visually distinct indicator can be a directional arrow pointing east, west, north or south on the user interface. At 422, the alignment module 205 dynamically compares the features of the reference image with the preview image to determine whether a desired overlap between the reference image and the preview image satisfies a predetermined overlap threshold. For example, the alignment module 205 uses Hamming distance to compare image descriptors (i.e., 256-bit bitmasks) of the features of the reference image and the preview image of the object of interest to determine the overlap. At 424, the alignment module 205 checks whether the overlap threshold is satisfied. If the overlap threshold is satisfied, at 426, the feature extraction module 203 sets the preview image to be the reference image and the method 400 repeats the process from step 406. If the overlap threshold is not satisfied, the method 400 repeats the process from step 414.
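The Hamming-distance comparison at step 422 can be sketched as follows, with each 256-bit descriptor held in a Python integer. How descriptor matches are converted into an overlap value is not detailed at this point in the description, so the `matched_fraction` helper and its `max_distance` cutoff of 32 bits are assumptions used only to make the example concrete.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two descriptors differ."""
    return bin(a ^ b).count("1")


def matched_fraction(
    reference: Sequence[int],
    preview: Sequence[int],
    max_distance: int = 32,
) -> float:
    """Fraction of reference descriptors with a close match in the preview.

    A reference descriptor counts as matched when its nearest preview
    descriptor lies within max_distance bits (hypothetical cutoff).
    """
    if not reference or not preview:
        return 0.0
    matched = sum(
        1
        for ref in reference
        if min(hamming_distance(ref, cand) for cand in preview) <= max_distance
    )
    return matched / len(reference)
```

The brute-force nearest-neighbor search is quadratic in the number of features, which is acceptable for a sketch but would likely need an index or a cap on the feature count for live preview rates.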

If the lateral direction for the serpentine pattern is not known, then at 410, the user guidance module 207 checks whether the reference image is identified lateral to a previous reference image. If the reference image is identified lateral to the previous reference image, then at 412, the user guidance module 207 identifies the lateral direction of the serpentine pattern for guiding the client device 115 linearly across the object of interest and the method 400 proceeds to execute step 414. For example, if a subsequent image is captured to the left of a previously captured image, the user guidance module 207 determines that the lateral direction of the serpentine pattern is a right-to-left serpentine pattern for capturing images linearly across the object of interest. If the reference image is not lateral to the previous reference image, then the method 400 proceeds to execute step 414. At 414, the alignment module 205 determines whether there are preview images being sampled by the client device 115. If the preview images are not being sampled by the client device 115, then at 428, the stitching module 209 sends the images of the portions of the object of interest to generate a single linear panoramic image.
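Steps 410 and 412 can be illustrated with a small sketch. It assumes that an estimate of the displacement (`dx`, `dy`) of the new reference image relative to the previous one is available, for example from matched feature locations; that estimate, the function name and the rule that a move is lateral when its horizontal component dominates are all hypothetical.

```python
from typing import Optional


def lateral_direction(dx: float, dy: float) -> Optional[str]:
    """Classify the move from the previous reference image to the new one.

    dx is positive when the new image lies to the right of the previous
    image and negative when it lies to the left. Returns None when the
    move is not lateral (vertical component at least as large).
    """
    if abs(dx) <= abs(dy):
        return None  # vertical move: lateral direction stays unknown
    return "left-to-right" if dx > 0 else "right-to-left"
```

With this convention, an image captured to the left of the previous one (negative `dx`) yields a right-to-left serpentine pattern, matching the example in the paragraph above.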

FIGS. 5A-5B are flow diagrams illustrating another embodiment of a method 500 for capturing a series of images of an object of interest in a directionally guided pattern for generating a single linear panoramic image. At 502, the user guidance module 207 receives a selection of a serpentine pattern of image capture for receiving images of an object of interest from a client device. At 504, the feature extraction module 203 receives an image of a portion of the object of interest from the client device and identifies the image as a reference image. For example, the image can be an image of a shelf, a region, an artwork, a landmark, a scenic location, outer space, etc. For example, the direction of movement can be in a north, south, east, or west direction. At 506, the feature extraction module 203 determines features of the reference image. At 508, the user guidance module 207 checks whether a lateral direction for the serpentine pattern is known.

If the lateral direction for the serpentine pattern is known, at 516, the alignment module 205 determines whether there are preview images being sampled by the client device 115. If the preview images are being sampled, at 518, the alignment module 205 receives a preview image of another portion of the object of interest from the client device 115. At 520, the user interface module 211 generates a user interface including the preview image and a progressively growing mosaic preview on a display of the client device. For example, the mosaic preview provides progress information relating to the images received for the object of interest so far. At 522, the user interface module 211 adds to the mosaic preview a thumbnail representation of the reference image and identifies at least one location in the mosaic preview where a subsequent image of the object of interest is to be placed according to the serpentine pattern. At 524, the alignment module 205 dynamically compares the features of the reference image with the preview image to determine whether a desired overlap between the reference image and the preview image satisfies a predetermined overlap threshold. At 526, the alignment module 205 checks whether the overlap threshold is satisfied. If the overlap threshold is satisfied, at 528, the feature extraction module 203 sets the preview image to be the reference image and the method 500 repeats the process from step 506. If the overlap threshold is not satisfied, the method 500 repeats the process from step 516.

If the lateral direction for the serpentine pattern is not known, then at 510, the user guidance module 207 checks whether the reference image is identified lateral to a previous reference image. If the reference image is lateral to the previous reference image, then at 512, the user guidance module 207 identifies the lateral direction of the serpentine pattern for guiding the client device 115 linearly across the object of interest. At 514, the user guidance module 207 slides the progressively growing mosaic preview in an opposite direction on the user interface and the method 500 proceeds to execute step 516. For example, the mosaic preview is slid to the left on the user interface if the lateral direction of the serpentine pattern of image capture is a left-to-right direction. If the reference image is not lateral to the previous reference image, then the method 500 proceeds to execute step 516. At 516, the alignment module 205 determines whether there are preview images being sampled by the client device 115. If the preview images are not being sampled by the client device 115, then at 530, the stitching module 209 sends the images of the portions of the object of interest to generate a single linear panoramic image.

FIGS. 6A-6B are flow diagrams illustrating one embodiment of a method 600 for realigning the current preview image with a previously captured image of an object of interest. At 602, the feature extraction module 203 receives an image of a portion of an object of interest from a client device 115. At 604, the alignment module 205 determines whether realignment is needed. For example, the alignment module 205 may receive a user input to realign a preview image on the client device 115 with the previously captured image. If realignment is not needed, then the method 600 ends. If realignment is needed, then at 606, the feature extraction module 203 identifies the image as a ghost image and determines features of the ghost image. At 608, the alignment module 205 determines whether there are preview images being sampled by the client device 115. If the preview images are not being sampled, the method 600 ends. If the preview images are being sampled, then at 610, the alignment module 205 receives a preview image of another portion of the object of interest from the client device 115. At 612, the user interface module 211 generates a user interface overlaying the ghost image as a semi-transparent mask on top of the preview image on a display of the client device 115. At 614, the alignment module 205 dynamically compares the features of the ghost image with the preview image of the object of interest to determine a realignment between the ghost image and the preview image. At 616, the user interface module 211 adds to the user interface a visually distinct indicator overlaid upon the preview image for guiding a movement of the client device 115 to produce a desired realignment. At 618, the user interface module 211 updates a position of the visually distinct indicator relative to a target outline at a center of the preview image in the user interface based on the dynamic comparison, the position of the visually distinct indicator inside the target outline indicating that the realignment is successful. At 620, the alignment module 205 checks whether the realignment is successful. If the realignment is successful, at 622, the user interface module 211 updates the user interface to indicate the realignment is successful. If the realignment is not successful, the method 600 repeats the process from step 608.
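The success test at steps 618 and 620 amounts to checking whether the indicator lies inside the target outline. The sketch below assumes a circular target outline described by a center and a radius in display pixels; the function name and parameters are hypothetical, and how the indicator position is derived from the feature comparison is not modeled.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def indicator_in_target(indicator: Point, center: Point, radius: float) -> bool:
    """True when the indicator lies inside or on a circular target outline."""
    distance = math.hypot(indicator[0] - center[0], indicator[1] - center[1])
    return distance <= radius
```

For a 640 by 480 preview with the target centered at (320, 240) and a radius of 40 pixels, an indicator at (330, 250) would count as realigned, while one at (400, 240) would not.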

User Interfaces

In some embodiments, the alignment module 205 receives a request from a user of the client device 115 to capture an image of an object of interest. For example, the image can be an image of a shelf, a region, an artwork, a landmark, a scenic location, outer space, etc. In some embodiments, the alignment module 205 instructs the user interface module 211 to generate a user interface for including a preview image of the object of interest on a display of the client device 115. The feature extraction module 203 receives the image captured by the client device 115 and extracts a set of features for the image. As shown in the example of FIG. 7A, the graphical representation illustrates an embodiment of the user interface 700 for capturing an image of a shelf. For example, the image of the shelf captures a state of the shelf at a retail store. The user interface 700 in the graphical representation includes a frame 701 defined by four corner markers 702 for aligning the client device 115 with the shelf for image capture, a pair of target outlines 703 and 704 of concentric circles for centering the shelf at the middle of the display, a gyro horizon line 705 and pairs of tilt-reference arrows 709a-709b and 711a-711b on the periphery for indicating whether a preview image 707 of the shelf is off-center and/or tilting before capturing the image. The thin straight line 715 connecting the tilt-reference arrows 709a-709b may move laterally left and right in unison with the tilt-reference arrows 709a-709b to indicate a tilting of the client device 115 in an axis of orientation. The thin straight line 717 connecting the tilt-reference arrows 711a-711b may move up and down in unison with the tilt-reference arrows 711a-711b to indicate a tilting of the client device 115 in another axis of orientation. The outer target outline 704 may include a pair of tilt-reference arrows 713a-713b that provides the same functionality as the tilt-reference arrows 709a-709b but in a different way. In another example, as shown in FIG. 7B, the graphical representation illustrates another embodiment of the user interface 750 for capturing an image of a shelf. The user interface 750 in the graphical representation is minimalistic. The tilt-reference arrows 709a-709b from FIG. 7A are discarded in FIG. 7B. The tilt-reference arrows 713a-713b placed inside the outer target outline 704 are made use of instead. The tilt-reference arrows 709a-709b in conjunction with the gyro horizon line 705 may indicate whether the preview image 707 of the shelf is off-center and/or tilting. For example, the tilt-reference arrows 709a-709b and the gyro horizon line 705 may rotate clockwise/anti-clockwise depending on a direction in which the client device 115 is rolling about the Z axis. The image of the shelf may be received for recognition and may include multiple items of interest. For example, the image can be an image of packaged products on a shelf (e.g., coffee packages, breakfast cereal boxes, soda bottles, etc.) in a retail store. The packaged product may include textual and pictorial information printed on its surface that distinguishes it from other items on the shelf. In one example, the display of the client device 115 may flash to indicate that the image was captured in response to the user tapping the screen.

In some embodiments, the feature extraction module 203 receives an image of a portion of an object of interest from the client device 115, extracts a set of features from the image and sends the set of features to the alignment module 205. The set of features extracted may be robust to variations in scale, rotation, ambient lighting, image acquisition parameters, etc. The feature extraction module 203 locates each feature in the set of features and determines a location, an orientation, and an image descriptor for each feature. The location may be a relative location to a point in the image (e.g., the location of one identified feature) where each feature occurs. In some embodiments, the feature extraction module 203 uses corner detection algorithms, such as the Shi-Tomasi corner detection algorithm, the Harris and Stephens corner detection algorithm, etc., for determining feature location. In some embodiments, the feature extraction module 203 uses the Binary Robust Independent Elementary Features (BRIEF) descriptor approach for determining efficient image feature descriptors. An image descriptor of a feature may be a 256-bit bitmask which describes the image sub-region covered by the feature. In some embodiments, the feature extraction module 203 may compare the intensities of the two pixels in each of 256 pixel pairs near the feature and, based on each comparison, the feature extraction module 203 may set or clear one bit in the 256-bit bitmask. In some embodiments, the feature extraction module 203 determines whether the received image is optimal for image recognition and instructs the user interface module 211 to generate data for instructing the user to retake the image if a section of the image taken has limited information for complete recognition (e.g., a feature-rich portion is cut off), the image is too blurry, the image has an illumination artifact (e.g., excessive reflection), etc. In some embodiments, the feature extraction module 203 identifies the image captured by the client device 115 as a reference image and stores the set of identified features for the reference image in a cache. For example, the feature extraction module 203 processes the image and determines whether it satisfies the criteria (location, orientation and alignment) for being the first image in the series of images needed to form the single linear panoramic image. If it does, then the feature extraction module 203 identifies the image as a reference image. In other embodiments, the feature extraction module 203 sends the image captured by the client device 115 to the stitching module 209. In other embodiments, the feature extraction module 203 receives the preview images of an object of interest from the alignment module 205, extracts a set of features from the preview image in real time and sends the set of features to the alignment module 205.
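The pixel-pair intensity test behind the 256-bit descriptor can be sketched in plain Python. This is a simplified, BRIEF-style illustration and not a reference implementation: the sampling pattern is drawn from a seeded random generator, the patch half-width of 8 pixels is arbitrary, pixel coordinates are clamped at the image border, and the smoothing that BRIEF normally applies before sampling is omitted. The names `make_sampling_pairs` and `brief_descriptor` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[int, int, int, int]  # (dx1, dy1, dx2, dy2) offsets from the feature


def make_sampling_pairs(n_pairs: int = 256, half_patch: int = 8, seed: int = 0) -> List[Pair]:
    """Draw a fixed, repeatable set of pixel-pair offsets around a feature."""
    rng = random.Random(seed)
    return [
        (
            rng.randint(-half_patch, half_patch),
            rng.randint(-half_patch, half_patch),
            rng.randint(-half_patch, half_patch),
            rng.randint(-half_patch, half_patch),
        )
        for _ in range(n_pairs)
    ]


def brief_descriptor(image: Sequence[Sequence[int]], x: int, y: int, pairs: Sequence[Pair]) -> int:
    """Build a bitmask with one bit per pixel pair.

    image is a row-major grid of grayscale intensities. Bit i is set when
    the first pixel of pair i is darker than the second, otherwise cleared.
    """
    height, width = len(image), len(image[0])

    def pixel(px: int, py: int) -> int:
        # Clamp to the image border so features near an edge still work.
        px = min(max(px, 0), width - 1)
        py = min(max(py, 0), height - 1)
        return image[py][px]

    descriptor = 0
    for bit, (dx1, dy1, dx2, dy2) in enumerate(pairs):
        if pixel(x + dx1, y + dy1) < pixel(x + dx2, y + dy2):
            descriptor |= 1 << bit
    return descriptor
```

The same `pairs` list must be reused for every feature in every image; descriptors built from different sampling patterns are not comparable.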

For purposes of creating a linear panoramic image using a series of images, the user may move the client device 115 in any direction along the object of interest while remaining parallel to the object of interest for capturing subsequent images following a first image. For example, the user carrying the client device 115 can move in a north, south, east, or west direction from one point of location to another while remaining parallel to the shelving unit for capturing other images in the series. The images needed for creating the linear panoramic image of a lengthy shelving unit cannot be captured by the user of the client device 115 by remaining stationary at a fixed point of location. This is because, from a fixed point of location, the user can merely pivot vertically or horizontally for capturing surrounding images that connect to the first image. If the images of the shelf were to be captured in such a manner, the images cannot be stitched together without producing strange artifacts in the panoramic image at locations where two images are stitched together. In some embodiments, the user guidance module 207 receives a user selection of a pattern of image capture for capturing the series of images. The user guidance module 207 instructs the user interface module 211 to provide guidance to the user via the client device 115 on how to capture a next image in the series of images based on the selected pattern of image capture.

In some embodiments, the selected pattern of image capture may be a serpentine scan pattern. In the serpentine scan pattern, the sequence in image capture may alternate between the top and the bottom (or between the left and the right) while the client device 115 is moving parallel to the object of interest in a horizontal direction (or a vertical direction). The user guidance module 207 instructs the user interface module 211 to generate a user interface for guiding a movement of the client device 115 by the user based on the serpentine scan pattern. For example, the user interface may indicate that the client device 115 is to move first down (or up) the object of interest, then to the right (or left) of the object of interest, then up (or down) the object of interest, then to the right (or left) of the object of interest, and again down (or up) the object of interest, in order to follow the serpentine scan pattern. The feature extraction module 203 receives an image of the object of interest captured by the client device 115 at the end of each movement.
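The serpentine capture order in the example above (down, right, up, right, down) can be written as a traversal of a grid of capture positions. The grid abstraction, the (row, column) convention with row 0 at the top, and the function name `serpentine_order` are assumptions introduced for illustration; the sketch covers only the left-to-right, top-first variant.

```python
from typing import List, Tuple


def serpentine_order(rows: int, cols: int) -> List[Tuple[int, int]]:
    """Visit a rows x cols grid column by column, alternating down and up."""
    order: List[Tuple[int, int]] = []
    for col in range(cols):
        # Even columns run top to bottom, odd columns bottom to top.
        row_range = range(rows) if col % 2 == 0 else range(rows - 1, -1, -1)
        for row in row_range:
            order.append((row, col))
    return order
```

For a shelf covered by two rows and three columns of images, the order is top-left, bottom-left, bottom-middle, top-middle, top-right, bottom-right, so each capture is adjacent to the previous one.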

In some embodiments, the selected pattern of image capture may be a raster scan pattern. The raster scan pattern covers the image capture of the object of interest by moving the client device 115 progressively along the object of interest, one line at a time. The user guidance module 207 instructs the user interface module 211 to generate a user interface for guiding a movement of the client device 115 by the user based on the raster scan pattern. For example, the user interface may indicate that the client device 115 is to move from left-to-right (or right-to-left) of the object of interest in a line, then move down (or up) the object of interest at the end of the line and start again from left-to-right (or right-to-left) of the object of interest in a next line, in order to follow the raster scan pattern. The feature extraction module 203 receives an image of the object of interest captured by the client device 115 at the end of each movement of the client device 115 from left-to-right (or right-to-left).

In other embodiments, the selected pattern of image capture may be an over-and-back scan pattern. The over-and-back scan pattern covers the image capture of the object of interest by moving the client device 115 over a portion of the object of interest in a horizontal (or vertical) direction to one end and then moving the client device 115 back to capture another portion of the object of interest that was not covered. The user guidance module 207 instructs the user interface module 211 to generate a user interface for guiding a movement of the client device 115 by the user based on the over-and-back scan pattern. For example, the user interface may indicate that the client device 115 is to move from left-to-right (or right-to-left) of the object of interest to one end, then move down (or up) the object of interest, and then move from right-to-left (or left-to-right) back to the starting end, in order to follow the over-and-back scan pattern. The feature extraction module 203 receives an image of the object of interest captured by the client device 115 at the end of each movement of the client device 115 from left-to-right to one end and at the end of each movement of the client device 115 from right-to-left back to the starting end.
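
As an illustration only, the three scan patterns described above can be sketched as orderings of capture positions on a grid laid over the object of interest. The grid model, the (row, column) coordinates and the function names are assumptions made for this sketch and are not part of the specification.

```python
from typing import List, Tuple

# Hypothetical model: a capture position is a (row, column) cell on a grid
# laid over the object of interest; row 0 is the top, column 0 is the left.
Cell = Tuple[int, int]


def raster_scan(rows: int, cols: int) -> List[Cell]:
    """Cover each line left-to-right, one line at a time."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def serpentine_scan(rows: int, cols: int) -> List[Cell]:
    """Move down, step right, move up, step right, move down, and so on."""
    order: List[Cell] = []
    for c in range(cols):
        # Even columns run top-to-bottom, odd columns run bottom-to-top.
        row_order = range(rows) if c % 2 == 0 else range(rows - 1, -1, -1)
        order.extend((r, c) for r in row_order)
    return order


def over_and_back_scan(rows: int, cols: int) -> List[Cell]:
    """Sweep a line to one end, step down, then sweep back to the start."""
    order: List[Cell] = []
    for r in range(rows):
        # Even rows run left-to-right, odd rows run right-to-left.
        col_order = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in col_order)
    return order
```

A user guidance component could walk such a list to decide which direction to indicate next on the user interface.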

As shown in the example of FIG. 8, the graphical representation 800 illustrates one embodiment of an overlap between images captured of an object of interest. The graphical representation 800 includes a first captured image 801 and a second captured image 803 of a shelving unit 805 in a retail store. The shelving unit 805 is stocked with consumer products. The graphical representation 800 illustrates the overlap 807 between the first captured image 801 and the second captured image 803. In some embodiments, the alignment module 205 instructs the user interface module 211 to generate a user interface to guide movement of the client device 115 to capture a next image in the series of images that overlaps with a previously captured image of the object of interest by a certain amount. The overlap may be computed in either the horizontal or vertical direction depending on the direction in which the user moves the client device 115. This overlap may be a threshold amount of overlap (e.g., approximately 60%) between the images expected by a stitching algorithm used for creating the linear panorama by stitching together each of the individually captured images in the series. In some embodiments, the image overlap threshold value may be tuned based on the stitching algorithm used by the recognition server 101. For example, the stitching algorithm may be the Stitcher class included in the Open Source Computer Vision (OpenCV) package, where feature finding and description algorithms supporting the Stitcher class can be one or more from a group of Binary Robust Invariant Scalable Keypoints (BRISK) algorithm, Fast Retina Keypoint (FREAK) algorithm, Oriented FAST and Rotated BRIEF (ORB) algorithm, etc. In some embodiments, the image overlap threshold value may be other percentages. In some embodiments, the image overlap threshold value may have a range between 55% and 65%.
As such, the client device 115 may tune parameters for capturing images so that the captured images are compatible with, and improve the performance of, the stitching algorithm.
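
The overlap range above can be illustrated with a short sketch. It assumes, purely for illustration, that two frames of equal width are offset by a known shift along the direction of movement; the function names and the pixel-shift model are hypothetical and are not taken from the specification.

```python
def overlap_fraction(frame_width: float, shift: float) -> float:
    """Fraction of the frame shared by two equally sized frames.

    Assumption: the second frame is the first frame translated by `shift`
    (same units as `frame_width`) along the direction of movement.
    """
    if frame_width <= 0:
        raise ValueError("frame_width must be positive")
    return max(0.0, 1.0 - abs(shift) / frame_width)


def within_overlap_range(overlap: float, low: float = 0.55, high: float = 0.65) -> bool:
    """Check the overlap against the 55% to 65% range mentioned in the text."""
    return low <= overlap <= high
```

Under this model, a 1000-pixel-wide frame shifted by 400 pixels overlaps the previous frame by about 60%, which falls inside the range.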

In some embodiments, the alignment module 205 continuously receives the current preview image of a portion of the object of interest as displayed by the client device 115 when the client device 115 is pointing at the object of interest. The current preview image can be the live preview generated on a display screen of the client device 115 by continuously receiving the image formed on the lens and processed by the image sensor included within the client device 115. In some embodiments, the alignment module 205 sends the preview images for the object of interest that are being received continuously from the client device 115 to the feature extraction module 203 for extracting the image features. For example, the feature extraction module 203 determines image features for the images in the camera preview as the client device 115 moves along the object of interest.

In some embodiments, the alignment module 205 dynamically compares the identified features of a previously captured image of the object of interest with the features of the current preview image being displayed by the client device 115. The alignment module 205 identifies distinctive features in the previously captured image and then efficiently matches them to the features of the current preview image to quickly establish a correspondence between the pair of images. For example, if the variable ‘i’ represents the most recent, previously captured image, then the set of image features for that image may be represented as F_(i), and the set of image features for the current image in the image pipeline may be represented by F_(i+1). The set of image features for the very first image in the sequence may be represented as F₀. In some embodiments, the alignment module 205 determines a similarity function to compare the feature set F_(i) of the previously captured image to the feature set F_(i+1) of the current preview image to generate a similarity measure S_(i). For example, the formula may be stated as sim (F_(i), F_(i+1))=S_(i). The value S_(i) represents the amount of similarity between the previously captured image and the current preview image.
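
The specification does not define the similarity function itself. The following is a minimal sketch under the assumption that each feature set is a set of hashable feature identifiers and that similarity is the fraction of the previous image's features that reappear in the current preview. Both the representation and the formula are illustrative choices, not the disclosed method.

```python
from typing import Hashable, Set


def sim(f_prev: Set[Hashable], f_curr: Set[Hashable]) -> float:
    """Hypothetical sim(F_i, F_(i+1)) returning S_i in the range [0, 1].

    Assumption: S_i is the share of features of the previously captured
    image that are also found in the current preview image.
    """
    if not f_prev:
        return 0.0  # nothing to match against
    return len(f_prev & f_curr) / len(f_prev)


# Example: three of the five features of the previous image are still visible.
F_i = {"a", "b", "c", "d", "e"}
F_next = {"c", "d", "e", "f", "g"}
S_i = sim(F_i, F_next)
```

In practice the feature sets would come from a feature detector, and matching would compare descriptors rather than test exact equality.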

In some embodiments, the alignment module 205 uses the image overlap threshold as a parameter along with the dynamic feature comparison between the current preview image and the previously captured image for providing guidance and/or feedback to the user via a user interface on the client device 115. For example, the alignment module 205 uses the image overlap threshold to set a similarity value ‘V’ at 0.6. In some embodiments, the alignment module 205 may receive data including movement of the client device 115 from the orientation sensors 245 when the user moves the client device 115 in one of the directions (e.g., north, south, east or west) parallel to the object of interest after capturing the previous image. In some embodiments, the alignment module 205 determines a direction of movement of the client device 115 based on the dynamic feature comparison between the previously captured image of the object of interest and the current preview image as displayed by the client device 115. The dynamic feature comparison between the previously captured image and the current preview image determines an extent of the image differentiation. The alignment module 205 determines whether there is an existing overlap between the previously captured image and the current preview image in the direction of movement of the client device 115 and whether the existing overlap is approaching a predetermined image overlap threshold when the client device 115 is moving in the direction of movement. The alignment module 205 instructs the user interface module 211 to generate a visually distinct indicator for overlap on the user interface responsive to the determined overlap in the direction of the movement of the client device 115. The visually distinct indicator for overlap may be overlaid upon the preview image displayed by the client device 115. The visually distinct indicator for overlap can be visually distinct by one or more from the group of a shape, a size, a color, a position, an orientation, and shading.

The alignment module 205 couples the position of the visually distinct indicator for overlap on the user interface with the direction of movement of the client device 115. For example, if the user carrying the client device 115 is moving from left-to-right, the visually distinct indicator for overlap may initially appear on the right side of the display and begin to move to the left side based on the dynamic feature comparison. In another example, if the user carrying the client device 115 is moving from right-to-left, the visually distinct indicator for overlap may initially appear on the left side of the display and begin to move to the right side based on the dynamic feature comparison. The alignment module 205 continues to dynamically compare the identified features of the previously captured image of the object of interest with the features of the current preview image in the direction of movement of the client device 115. The alignment module 205 translates the dynamic comparison data in the direction of movement into a change in the position of the visually distinct indicator on the user interface, which provides the user with instantaneous feedback on how to move the client device 115 to achieve an optimal overlap satisfying the predetermined overlap threshold. For example, if the overlap between the previously captured image and the current preview image corresponds to a predetermined image overlap threshold (e.g., similarity value ‘V’=0.6, or 60%) in a direction of movement, then the position of the visually distinct indicator for overlap changes on the user interface to indicate that such a condition has been met. The visually distinct indicator for overlap may move into a bounded target outline of a geometric shape, such as a circle, a square, or a polygon, overlaid upon the preview image at the center of the display of the client device 115 to illustrate that the condition has been met for optimal overlap.
In some embodiments, the alignment module 205 uses a tolerance value ‘T’ along with similarity value ‘V’ to compute when the visually distinct indicator for overlap is within range, for example, inside the geometric shape. In some embodiments, the alignment module 205 uses the tolerance value ‘T’ to allow a bit of fuzziness with respect to how much of the visually distinct indicator for overlap needs to be inside of the geometric shape before the image may be captured. In other words, the visually distinct indicator can be partially within the geometric shape and partially outside the geometric shape. The visually distinct indicator may not need to fit exactly within the geometric shape before the image can be captured. In some embodiments, the alignment module 205 instructs the user interface module 211 to generate a progress status bar on the user interface to indicate an extent of overlap occurring between the previously captured image and the current preview image until the image overlap threshold is met. For example, the progress status bar may show incremental progress in achieving the overlap. In other embodiments, the alignment module 205 sends a capture command to the client device 115 to capture the image responsive to the overlap satisfying the image overlap threshold, receives the image from the client device 115 and sends the image to the feature extraction module 203.

In some embodiments, the alignment module 205 determines a distance measure function along with the similarity function for sending instructions to the user interface module 211. For example, the instructions to the user interface module 211 may be instructions that drive the user interface for displaying the visually distinct indicator for overlap and determine when to capture the image. The distance measure function represents a sum of all similarity measures ‘S’ determined thus far, from image F₀ (i.e., S₀) to image F_(i) (i.e., S_(i)) and may be represented as dist (S_(i)). The distance measure function determines how close the two images F₀ and F_(i) are to each other. The alignment module 205 determines whether the value of the distance measure function dist (S_(i)) is within the tolerance value ‘T’ of similarity value ‘V’ such that the condition (V−T)<dist (S_(i))<(V+T) is satisfied. If it is satisfied, then the alignment module 205 sends a capture command to the client device 115 to capture the image. As the distance measure function dist (S_(i)) approaches the range defined by the tolerance value ‘T’, the alignment module 205 uses a value produced by the distance measure function dist (S_(i)) to represent the visually distinct indicator for overlap getting closer to the geometric shape to fit within the bounded region of the geometric shape on the user interface. For example, this may translate into the visually distinct indicator for overlap appearing less and less transparent on the user interface of the client device 115.
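
The capture condition (V−T)<dist (S_(i))<(V+T) can be sketched as follows. The value V=0.6 comes from the text. The tolerance T=0.05 and the linear opacity mapping are assumptions chosen for illustration, and all function names are hypothetical.

```python
from typing import Iterable


def dist(similarity_measures: Iterable[float]) -> float:
    """Distance measure: the sum of the similarity measures seen so far."""
    return sum(similarity_measures)


def should_capture(d: float, v: float = 0.6, t: float = 0.05) -> bool:
    """True when (V - T) < dist(S_i) < (V + T)."""
    return (v - t) < d < (v + t)


def indicator_opacity(d: float, v: float = 0.6) -> float:
    """Assumed mapping: opacity rises linearly to 1.0 as dist(S_i) nears V."""
    return max(0.0, 1.0 - abs(d - v) / v)


# Example: the indicator becomes more opaque as the distance measure nears V,
# and a capture command would be issued once the condition is met.
for d in (0.2, 0.4, 0.58):
    print(d, round(indicator_opacity(d), 2), should_capture(d))
```

A user interface layer could feed the opacity value directly into the alpha channel of the overlap indicator.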

As shown in the example of FIG. 9, the graphical representation 900 illustrates an embodiment of the image matching process for generating the visually distinct indicator for overlap. In FIG. 9, the graphical representation 900 includes camera preview frames 902 for changing image frames (F₁ to F₄) based on the user moving the client device 115 and receiving preview images on the display of the client device 115. The graphical representation 900 also includes a similarity measure function 904 computed for every two image frames 902 and a distance measure function 906 computed for the image frames 902 that have been received so far.

As shown in the example of FIGS. 10A-10D, the graphical representations illustrate embodiments of the user interface displaying a visually distinct indicator for overlap when the client device 115 moves in a left-to-right direction. In FIG. 10A, the graphical representation illustrates a user interface 1000 that includes a ball 1001 (shaded circle) and a pair of target outlines 1003 and 1003 of concentric circles over a current preview image 1005 of the shelf as displayed on the client device 115. The ball 1001 serves as the visually distinct indicator for overlap and initially appears transparent and at the right edge of the display on the user interface 1000 because of an overlap starting to occur as the client device 115 is being moved from left-to-right of the shelf. The inner target outline 1003 of a circle serves as a target boundary region within which the ball 1001 may be positioned. In some embodiments, the ball 1001 and the pair of target outlines 1003 and 1003 can be customized to be of any color, shading, transparency, orientation, shape, symbol, etc. The aim for the user is to align and position the ball 1001 within the inner target outline 1003 on the user interface 1000 by moving the client device 115 from left-to-right of the shelf in order to capture an overlapping image being continuously previewed on the display. The alignment of the ball 1001 within the outer target outline 1003 but outside of the inner target outline 1003 signifies that the overlap is good but not enough. The alignment of the ball 1001 within the inner target outline 1003 signifies that the overlap between the current preview image 1005 and a previously captured image is enough to satisfy the image overlap threshold for capturing a next image. In FIGS. 10B and 10C, the respective graphical representations illustrate updated user interfaces 1030 and 1060 that display the ball 1001 moving closer to the inner target outline 1003 and appearing less and less transparent in color to indicate the desired overlap being produced. In other embodiments, the appearance of the ball 1001 could be changed to visually indicate the degree of the overlap. For example, the ball 1001 may change color, shape, transparency, shading, orientation, etc. The position of the ball 1001, as it is getting closer and closer to the inner target outline 1003, indicates a progress associated with attaining the overlap between the current preview image 1005 and a previously captured image that corresponds to the image overlap threshold. In FIG. 10D, the graphical representation illustrates the user interface 1090 updated to display the ball 1001 centered within the inner target outline 1003 in a solid, non-transparent color. This indicates to the user that the image overlap threshold condition is satisfied for capturing the image. The satisfaction of the overlap threshold could be shown in various other ways by showing the ball 1001 in a visually distinct manner from its prior state, such as flashing, flashing in a different color, a change in shape (e.g., triangle, pentagon, etc.), a change in fill, etc. In some embodiments, the user interface 1090 may flash briefly with an audible shutter clicking sound on the client device 115 to indicate that the image has been captured. In FIG. 10D, the user interface 1090 may be reset and the ball 1001 may disappear from the user interface 1090 after the image has been captured until the client device 115 starts to move again in one of the directions over the shelf.

In another example of FIGS. 11A-11D, the graphical representations illustrate embodiments of displaying a visually distinct indicator for overlap when the client device 115 moves in a bottom-to-top direction. In FIG. 11A, the graphical representation illustrates a user interface 1100 that includes a ball 1101 and a pair of target outlines 1103 and 1104 of concentric circles over a current preview image 1105 of the shelf as displayed on the client device 115. The ball 1101 serves as the visually distinct indicator for overlap and initially appears transparent and at the top edge of the display on the user interface 1100 because of an overlap starting to occur as the client device 115 is being moved from bottom to top of the shelf. The aim for the user is to align and position the ball 1101 within the inner target outline 1103 on the user interface 1100 by moving the client device 115 from bottom to top of the shelf in order to capture an overlapping image being previewed on the display. The alignment of the ball 1101 within the inner target outline 1103 signifies that the overlap between the current preview image 1105 and a previously captured image satisfies the image overlap threshold for capturing a next image. In FIGS. 11B and 11C, the respective graphical representations illustrate updated user interfaces 1130 and 1160 that display the ball 1101 moving closer to the inner target outline 1103 and appearing less and less transparent in color. The position of the ball 1101, as it is getting closer and closer to the inner target outline 1103, indicates a progress associated with attaining the overlap between the current preview image 1105 and a previously captured image that corresponds to the image overlap threshold. In FIG. 11D, the graphical representation illustrates the user interface 1190 updated to display the ball 1101 centered within the inner target outline 1103 in a solid, non-transparent color.
This indicates to the user that the image overlap threshold condition is satisfied for capturing the image. In some embodiments, the user interface 1190 may flash briefly with an audible shutter clicking sound on the client device 115 to indicate that the image has been captured. In FIG. 11D, the user interface 1190 may reset and the ball 1101 may disappear from the user interface 1190 after the image has been captured until the client device 115 starts to move again in one of the directions over the shelf.

In some embodiments, the feature extraction module 203 receives subsequent captured images following a first captured image of an object of interest with little to no tilt between the images. The user guidance module 207 instructs the user interface module 211 to generate a user interface to guide the user to capture an overlapping image with little to no tilt in any of the axes of orientation (e.g., X, Y, or Z axis). The overlapping images with little to no tilt may be expected by the stitching algorithm for creating a high resolution linear panoramic image, which in turn may enable better image recognition. In some embodiments, the user guidance module 207 receives gyroscopic sensor data including tilting of the client device 115 in any of the three axes of orientation. The gyroscopic sensor data can be generated by the orientation sensors 245 included within the client device 115 that measure an angle of rotation in any of the three axes. For example, the angle of rotation in the X axis is defined by the pitch parameter, the angle of rotation in the Y axis is defined by the yaw parameter, and the angle of rotation in the Z axis is defined by the roll parameter. The user guidance module 207 determines whether the client device 115 is tilting in one of the axes of orientation when pointed at the object of interest based on the gyroscopic sensor data. The user guidance module 207 instructs the user interface module 211 to generate a visually distinct indicator for tilt on the user interface of the client device 115 responsive to the client device 115 tilting in one or more of the axes of orientation. The position and/or appearance of the visually distinct indicator for tilt on the user interface may be coupled to the tilting/orientation of the client device 115 in such a way that it can indicate through instantaneous feedback when there is a tilt associated with the client device 115 in any of the three axes of orientation.
In one example, the visually distinct indicator for tilt can be a gradient-based indicator to show tilt feedback on the periphery of the user interface on the client device 115. The gradient-based indicator can differ in colors, for example, a red color for indicating roll, a blue color for indicating pitch, and a white color for indicating yaw. In another example, the visually distinct indicator for tilt can be a horizon line displayed at the center of the user interface on the client device 115. In another example, the visually distinct indicator for tilt can be an angle offset indicator to show the angle of rotation about the X axis, Y axis, and Z axis of orientation on the user interface of the client device 115. In another example, the visually distinct indicator for tilt can be a line connecting two arrow points on opposite sides of the user interface displayed on the client device 115. The movement of the line connecting the two arrow points across the user interface may be configured to show tilt feedback on the user interface. In yet another example, the visually distinct indicator for tilt can be a combination of the gradient-based indicator, the horizon line, and the line connecting the two arrow points. In some embodiments, the user guidance module 207 instructs the user interface module 211 to generate a warning notification on the user interface to indicate to the user that the tilt has to be rectified first before the image of the object of interest can be captured.
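
A minimal sketch of the tilt check described above follows. It assumes the orientation sensors report pitch, yaw and roll as angles in degrees and that a small per-axis tolerance counts as "little to no tilt". The 2-degree tolerance, the function names and the warning text are assumptions made for illustration.

```python
from typing import List, Optional


def tilted_axes(pitch: float, yaw: float, roll: float,
                tolerance_deg: float = 2.0) -> List[str]:
    """Return the axes whose rotation angle exceeds the assumed tolerance."""
    angles = {"pitch (X)": pitch, "yaw (Y)": yaw, "roll (Z)": roll}
    return [axis for axis, angle in angles.items() if abs(angle) > tolerance_deg]


def tilt_warning(pitch: float, yaw: float, roll: float,
                 tolerance_deg: float = 2.0) -> Optional[str]:
    """Build a warning string when tilt must be rectified before capture."""
    axes = tilted_axes(pitch, yaw, roll, tolerance_deg)
    if not axes:
        return None  # no tilt beyond tolerance, capture may proceed
    return "Rectify tilt before capture: " + ", ".join(axes)
```

For example, a reading of pitch 0.5, yaw 0.0 and roll -5.0 degrees would flag only the roll (Z) axis.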

As shown in the example of FIGS. 12A-12C, the graphical representations illustrate embodiments of the user interface displaying a visually distinct indicator for tilt when the client device 115 is rolling about the Z axis. In FIG. 12A, the graphical representation illustrates a user interface 1200 that includes a pair of roll reference arrows 1201 a-1201 b, a pair of pitch reference arrows 1209 a-1209 b and a horizon line 1203 over a current preview image 1205 of the shelf as displayed on the client device 115. The roll reference arrows 1201 a-1201 b are positioned at the top and the bottom peripheral portions of the user interface 1200. They are connected by a thin straight line 1207 and may serve as the visually distinct indicator for rolling. The pitch reference arrows 1209 a-1209 b are positioned on the left and the right peripheral portions of the user interface 1200. They are connected by a thin straight line 1211 and may serve as the visually distinct indicator for pitching. In FIG. 12A, the roll reference arrows 1201 a-1201 b connected by the thin straight line 1207, the pitch reference arrows 1209 a-1209 b connected by the thin straight line 1211 and the horizon line 1203 are in a neutral position since the client device 115 is not tilted while pointing at the shelf. In FIG. 12B, the graphical representation illustrates an updated user interface 1230 when the client device 115 is rolling to the left while being parallel to the shelf. The roll reference arrows 1201 a-1201 b connected by the thin straight line 1207 move to the left of the user interface 1230 to indicate the extent of roll associated with the client device 115 pointing at the shelf. The pitch reference arrows 1209 a-1209 b connected by the thin straight line 1211 do not change position since the client device 115 is not pitching.
In addition to the roll reference arrows 1201 a-1201 b, the user interface 1230 also includes roll gradients 1213 a and 1213 b on the periphery of the user interface 1230 to serve as the visually distinct indicator for rolling. The roll gradients 1213 a and 1213 b indicate how off center the tilt is because of the roll to the left. The horizon line 1203 provides additional information about how far away the client device 115 is from the neutral roll position. In FIG. 12C, the graphical representation illustrates another updated user interface 1260 when the client device 115 is rolling to the right while being parallel to the shelf. The roll reference arrows 1201 a-1201 b connected by the thin straight line 1207 move to the right of the user interface 1260 to indicate the extent of roll associated with the client device 115 pointing at the shelf. The roll gradients 1213 a-1213 b again indicate how off center the tilt is because of the roll to the right and the horizon line 1203 shows how far away the client device 115 is from the neutral roll position. In some embodiments, the ball 1215 in FIGS. 12B and 12C may turn a different color (e.g., yellow) to indicate that the client device 115 is rolling to the left or to the right. In some embodiments, the ball 1215 may become centered within the inner target outline 1217 when there is a decent overlap with a previously captured image. The user guidance module 207 instructs the user interface module 211 to generate a warning notification on the user interface to indicate to the user that the tilt has to be rectified first before the image can be captured. In some embodiments, the roll reference arrows 1201 a-1201 b may be absent in the user interface. The user interface 1200 shown in the graphical representation of FIG. 12A may be updated to display horizontal grid lines (not shown) instead of the roll reference arrows 1201 a-1201 b. The horizontal grid lines may be displayed visually over the current preview image 1205.
If tilt occurs when the client device 115 is rolling about the Z axis, the horizon line 1203 displayed over the current preview image 1205 in the user interface may disengage from the neutral roll position and rotate about the center in either a clockwise or an anticlockwise direction depending on whether the client device 115 is rolling left or rolling right. The position of the horizon line 1203 for roll tilt on the user interface may be coupled to the movement of the client device 115. The aim for the user is to align and position the horizon line 1203 parallel to the grid lines by moving the client device 115. This may be done to rectify the roll tilt before the image can be captured.

As shown in the example of FIGS. 13A-13C, the graphical representations illustrate embodiments of the user interface displaying a visually distinct indicator for tilt when the client device 115 is pitching about the X axis. In FIG. 13A, the graphical representation illustrates a user interface 1300 that includes a pair of pitch reference arrows 1301 a-1301 b and a pair of roll reference arrows 1303 a-1303 b over a current preview image 1305 of the shelf as displayed on the client device 115. The pitch reference arrows 1301 a-1301 b are positioned on the left and the right peripheral portions of the user interface 1300. The pitch reference arrows 1301 a-1301 b are connected by a thin straight line 1307 and may serve as the visually distinct indicator for pitch. In FIG. 13A, the pitch reference arrows 1301 a-1301 b are in a neutral pitch position since the client device 115 is not tilted while pointing at the shelf. In FIG. 13B, the graphical representation illustrates an updated user interface 1330 when the client device 115 is pitching forward. The top of the client device 115 is closer to the top of the shelf and products toward the top of the shelf appear large on the current preview image 1305. The pitch reference arrows 1301 a-1301 b connected by the thin straight line 1307 move to the top of the user interface 1330 to indicate the extent of pitch associated with the client device 115 pointing at the shelf. The pair of roll reference arrows 1303 a-1303 b connected by the thin straight line 1309 do not change position since the client device 115 is not rolling. In addition to the pitch reference arrows 1301 a-1301 b, the user interface 1330 also includes pitch gradients 1311 a and 1311 b on the periphery of the user interface to serve as the visually distinct indicator for pitching. The pitch gradients 1311 a and 1311 b indicate how much pitch is being sensed by the client device 115. In FIG. 13C, the graphical representation illustrates another updated user interface 1360 when the client device 115 is pitching backward. The bottom of the client device 115 is closer to the bottom of the shelf and products toward the bottom of the shelf appear large on the current preview image 1305. The pitch reference arrows 1301 a-1301 b connected by the thin straight line 1307 move to the bottom of the user interface 1360 to indicate the extent of pitch associated with the client device 115 pointing at the shelf. The pitch gradients 1311 a and 1311 b again indicate how much pitch is being sensed by the client device 115 when it is pitching backward. In some embodiments, the ball 1313 in FIGS. 13B and 13C may turn a different color to indicate that the client device 115 is pitching forward or backward.

As shown in the example of FIGS. 14A-14B, the graphical representations illustrate embodiments of the user interface displaying a visually distinct indicator for tilt when the client device 115 is tilting in both X and Z axes. In FIG. 14A, the graphical representation illustrates a user interface 1400 when the client device 115 is pitching forward and rolling to the left while being pointed at the shelf. The thin straight line 1415 connecting the roll reference arrows 1407 a-1407 b and the thin straight line 1417 connecting the pitch reference arrows 1411 a-1411 b cross each other outside the inner target outline 1403 to form the cross point 1401. The position of the cross point 1401 outside the inner target outline 1403 may indicate to the user visually that the client device 115 is tilting in the X axis or in the Z axis or in both the X and Z axes. In FIG. 14B, the graphical representation illustrates another user interface 1450 when the client device 115 is pitching backward and rolling to the right while being pointed at the shelf. The cross point 1401 is again located outside the target outline 1403, which indicates to the user visually that the client device 115 is tilting in the X axis or in the Z axis or in both the X and Z axes. In FIGS. 14A and 14B, the peripheral portion of the user interfaces 1400 and 1450 including the gradient-based indicators (e.g., roll gradients 1409 a-1409 b, pitch gradients 1413 a-1413 b, etc.) may change color to indicate to the user visually that the client device 115 is tilting too much in one or more axes.
The roll reference arrows 1407 a-1407 b connected by the straight line 1415 glide left and right and the pitch reference arrows 1411 a-1411 b connected by the straight line 1417 glide up and down on the periphery of the user interfaces 1400 and 1450 in conjunction with their corresponding roll gradients 1409 a-1409 b in the roll (Z) axis and pitch gradients 1413 a-1413 b in the pitch (X) axis to provide instantaneous feedback to the user regarding the tilt.
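
The cross point behavior of FIGS. 14A-14B can be sketched with normalized display coordinates. The sketch assumes the display center is (0, 0), that each axis spans -1 to 1, that roll to the left is a negative angle, that pitching forward is a positive angle, and that 30 degrees maps to the display edge. All of these conventions, the target radius and the function names are assumptions for illustration only.

```python
import math
from typing import Tuple


def _clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Keep a coordinate on the display."""
    return max(low, min(high, value))


def cross_point(roll_deg: float, pitch_deg: float,
                max_deg: float = 30.0) -> Tuple[float, float]:
    """Point where the roll line (vertical) and pitch line (horizontal) cross.

    Roll moves the vertical line left or right; pitch moves the horizontal
    line up or down. The result is a normalized (x, y) position.
    """
    return (_clamp(roll_deg / max_deg), _clamp(pitch_deg / max_deg))


def inside_target(point: Tuple[float, float], radius: float = 0.2) -> bool:
    """True when the cross point lies within the assumed inner target outline."""
    return math.hypot(point[0], point[1]) <= radius
```

With these conventions, a device pitching forward and rolling left by 15 degrees each would place the cross point in the upper-left region, outside the target.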

In some embodiments, the alignment module 205 receives a request from the user to align a current preview image of the object of interest as displayed by the client device 115 with a view point of a previously captured image after an interruption in the sequence of the image capture pattern. For example, the user may get interrupted while capturing an image of a portion of the object of interest and may have to leave the scene for a period of time. The user may then want to return to continue capturing subsequent images of the object of interest. In some cases, the user may not remember where they were interrupted in the image capture process. In the example of capturing images of a shelving unit in an aisle, it is critical to restart the image capture process at approximately the same position where the last image was captured before the interruption. In some embodiments, the visually distinct indicators for overlap and/or direction may not function unless the user restarts the image capture process from a position of good overlap with the previously captured image. It is important to find a general area where the previous image of the object of interest was captured by the client device 115 before restarting the image capture process.

In some embodiments, the feature extraction module 203 identifies the previously captured image as a ghost image with which a realignment of the preview image is desired and sends the ghost image to the alignment module 205. The alignment module 205 instructs the user interface module 211 to generate a user interface that places the previously captured image as a ghost image on top of the current preview image being displayed by the client device 115. For example, the user may walk over to a location along the object of interest where they understand the last image was previously captured and use the overlay of the ghost image on top of the current preview image to start the realignment process. The ghost image may appear as a semi-transparent mask overlaid upon the preview image. The alignment module 205 instructs the user interface module 211 to update the user interface with a visually distinct indicator for guiding a movement of the client device 115 to produce a desired realignment. The visually distinct indicator for realignment can be visually distinct by one or more from the group of a shape, a size, a color, a position, an orientation, and shading. The feature extraction module 203 determines image features for the preview images in the camera preview as the client device 115 moves along the object of interest and sends the image features to the alignment module 205. The alignment module 205 couples the position of the visually distinct indicator for realignment on the user interface with the movement of the client device 115. The alignment module 205 dynamically compares the identified features of the previously captured image of the object of interest with the features of the current preview image in the direction of movement of the client device 115. For example, the set of image features for the previously captured image may be represented as F₀. The set of image features determined for a preview image frame may be represented by Fᵢ. As the client device 115 moves along the object of interest to realign with the previously captured image, the feature extraction module 203 generates image features for each preview image frame. If variable ‘i’ in Fᵢ is equal to five (i.e., five preview image frames have been captured not counting the previously captured image and the fifth preview image frame is F₅), then the alignment module 205 determines a similarity function to compare the previously captured image F₀ to the current preview image F₅ to generate a similarity measure S₅. For example, the similarity function can be represented as sim(F₀, F₅) = S₅. This value S₅ represents how similar the two images are to each other and indicates how far the user must move along the object of interest to realign with the previously captured image. The similarity measure S₅ indicates a comparison with the previously captured image F₀ serving as the reference image and not with the last image feature set F₄ that precedes the image feature set F₅. The alignment module 205 then translates the dynamic comparison in the direction of movement (i.e., similarity function) into changing the position of the visually distinct indicator on the user interface such that it provides the user with feedback on how to move the client device 115 to achieve a proper realignment with the previously captured image. In some embodiments, the alignment module 205 receives a confirmation from the user interface module 211 that the realignment is successful. In some embodiments, the alignment module 205 instructs the user interface module 211 to update the user interface to indicate that the realignment is successful and return the user interface from realignment mode to capture mode that can guide the user on how to capture the next image in the series of images.
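The comparison sim(F₀, Fᵢ) = Sᵢ against a fixed reference can be sketched as follows. The specification does not state which similarity function is used, so this sketch assumes, purely for illustration, that each feature set is a set of hashable feature identifiers and that similarity is the Jaccard index (shared features divided by all features). The function names and the 200-pixel maximum offset are hypothetical.

```python
from typing import Hashable, Iterable, List, Set


def similarity(reference: Set[Hashable], preview: Set[Hashable]) -> float:
    """Jaccard index between two feature sets, in the range [0, 1].

    Assumed stand-in for sim(F0, Fi); the actual function is not specified.
    """
    union = reference | preview
    if not union:
        return 0.0
    return len(reference & preview) / len(union)


def indicator_offset(score: float, max_offset_px: float = 200.0) -> float:
    """Translate a similarity score into the indicator's distance from the target.

    A score of 1.0 places the indicator on the target (offset 0); a score of
    0.0 places it at the maximum offset. The linear mapping is an assumption.
    """
    clamped = min(max(score, 0.0), 1.0)
    return (1.0 - clamped) * max_offset_px


def realignment_scores(f0: Set[Hashable],
                       preview_frames: Iterable[Set[Hashable]]) -> List[float]:
    """Compare every preview frame with F0, never with the preceding frame."""
    return [similarity(f0, frame) for frame in preview_frames]


f0 = {"a", "b", "c", "d"}                      # features of the previously captured image
frames = [{"x", "y"}, {"d", "x"}, {"c", "d", "e", "f"}, {"a", "b", "c", "d"}]
scores = realignment_scores(f0, frames)        # rises as the device nears the old position
offsets = [indicator_offset(s) for s in scores]
```

Because every score is computed against F₀, the values increase monotonically only while the device actually approaches the original capture position, which is what lets the indicator guide the user.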

As shown in the example of FIG. 15, the graphical representation 1500 illustrates an embodiment of the realignment process for generating the visually distinct indicator for realignment. In FIG. 15, the graphical representation 1500 includes camera preview frames 1504 for changing image frames (F₁ to F₄) based on the user moving the client device 115 along an object of interest. The graphical representation 1500 also includes a similarity measure function 1506 computed between features of each preview image frame 1504 and the features of the previously captured image 1502. As described before, the similarity measure function 1506 represents how similar each preview image frame 1504 is to the previously captured image 1502 and indicates how the user must move the client device 115 along the object of interest to realign a preview image with the previously captured image 1502.

As shown in the example of FIGS. 16A-16D, the graphical representations illustrate an embodiment of the user interface for realigning a current preview image displayed on a client device 115 with a previously captured image. In FIG. 16A, the graphical representation illustrates a user interface 1600 that includes a ball 1601 and a pair of target outlines 1603 and 1604 of concentric circles over a ghost image 1605 appearing on top of the current preview image 1607 of the shelf as displayed by the client device 115. The ball 1601 serves as the visually distinct indicator for realignment. The inner target outline 1603 may appear modified with an ‘X’ crosshair to indicate that the user interface is in realignment mode. The inner target outline 1603 assumes the same appearance as the align button 1609 which the user of the client device 115 selects to start the alignment. The inner target outline 1603 serves as a target boundary region within which to position the visually distinct indicator for realignment. The aim for the user is to align and position the ball 1601 within the target outline 1603 on the user interface 1600 by moving the client device 115 to achieve alignment with the ghost image 1605. In FIG. 16B, the graphical representation illustrates an updated user interface 1630 that displays the ball 1601 moving closer to the inner target outline 1603 as the preview image 1607 begins to realign with the ghost image 1605. In FIG. 16C, the graphical representation illustrates another user interface 1660 that displays an updated inner target outline 1603 to show realignment is almost complete and the ball 1601 is almost inside the inner target outline 1603. The inner target outline 1603 is back to a regular crosshair. In FIG. 16D, the graphical representation illustrates the user interface 1690 updated to display the current preview image 1607 after realignment. The ghost image 1605 from FIG. 16C is no longer overlaid upon the preview image 1607 since the realignment is successful. This indicates to the user that the user interface 1690 is switched from realignment mode to capture mode and is now ready to capture a next image of the object of interest.

As shown in the example of FIGS. 17A-17F, the graphical representations illustrate another set of embodiments of the user interface for realigning a current preview image displayed on a client device 115 with a previously captured image. In FIG. 17A, the graphical representation illustrates a user interface 1700 that includes an image 1702 of the shelf as being captured by the client device 115 when the ball 1712 gets to be within the inner target outline 1714. The user interface 1700 includes a region 1704 for displaying a mosaic preview 1706 of the images of the shelf that may have been captured so far by the client device 115. The mosaic preview 1706 includes an empty thumbnail slot 1708 serving as the placeholder for a thumbnail representation of the image 1702 to be affixed to the mosaic preview 1706. The empty thumbnail slot 1708 is labeled ‘4’ since the image 1702 is the fourth image of the shelf to be captured by the client device 115. The user interface 1700 also indicates a number of images of the shelf that may have been captured so far with a text 1710 indicating the capture of three images. An example embodiment in reference to the mosaic preview and its construction is described in more detail in FIGS. 20A-20I. In FIG. 17A, when the user selects the pause button 1712 to take a break from the capture process, the user interface 1700 goes from active capture mode into realignment mode. In FIG. 17B, the graphical representation illustrates a user interface 1715 that includes a modified inner target outline 1717 for realignment overlaid upon a ghost image 1719 (a semi-transparent image mask) of the previously captured image (i.e., image 1702 from FIG. 17A). The ghost image 1719 is displayed on the client device 115 when the user hits the realign button 1721 to continue the image capture process after the break. The user interface 1715 updates the mosaic preview 1706 to include an empty thumbnail slot 1723 labeled ‘5’ to indicate a location where the fifth image of the shelf may get placed once the realignment is achieved and an image of the shelf is captured. In some embodiments, the mosaic preview 1706 may also provide a visual reminder to the user of the client device 115 as to where on the shelf to start the realignment. In FIG. 17C, the graphical representation illustrates a user interface 1730 that guides the movement of the client device 115 to realign a preview image 1732 with a ghost image 1719 of the previously captured image. The user interface 1730 indicates that the client device 115 is pitching forward and the preview image 1732 is nowhere close in appearance to the ghost image 1719 of the previously captured image. The user interface 1730 overlays the ghost image 1719 over the current preview image 1732 for visually guiding the movement of the client device 115. There is no appearance of a visually distinct indicator such as a ball yet since the client device 115 is pitching. The ball makes an appearance when the preview image 1732 and the ghost image 1719 begin to somewhat realign with each other. In FIG. 17D, the graphical representation illustrates an updated user interface 1745 that displays the ball 1747 making an appearance outside the modified target outline 1717. The appearance of the ball 1747 indicates that an overlap/realignment between the current preview image 1732 and the ghost image 1719 of the previously captured image has been detected since the client device 115 has moved closer to a location of the previously captured image on the shelf. In FIG. 17E, the graphical representation illustrates another user interface 1760 that displays an updated location for the ball 1747 near the target outline 1717 to show realignment is almost complete because of the development of a good overlap between the preview image 1732 and the ghost image 1719. In FIG. 17F, the graphical representation illustrates the user interface 1775 updated to display the current preview image 1732 after realignment is achieved. There is no overlay of the ghost image 1719 from FIG. 17E in the updated user interface 1775 since the realignment is successful. This indicates to the user that the user moved the client device 115 close enough to have the ball 1747 inside the inner target outline 1717 in FIG. 17E. The user interface 1775 is now changed to capture mode with the switch back to the inner target outline 1714 from FIG. 17A and ready to capture a next image of the object of interest since realignment is complete.

In a retail setting, the process of capturing the state of the shelves may require snapping a lot of images with the appropriate amount of overlap. For example, a minimum of 18 to 24 images may be captured for a 16 feet × 8 feet linear shelving unit. In the process of capturing the series of images for creating a linear panoramic image, the user may forget the direction (e.g., north, south, east or west) to move the client device 115 to capture a subsequent image. In some cases, the user may end up moving the client device 115 in the wrong direction altogether or in the direction where images have already been captured. For example, the user may move the client device 115 to the east along the object of interest when the user originally may have had to move the client device 115 to the south along the object of interest. Such mistakes may not be conducive to creating a high resolution linear panoramic image of the object of interest and may unduly increase the time spent capturing images of the object of interest. In some embodiments, the user guidance module 207 instructs the user interface module 211 to generate user interface elements that can guide the user in the appropriate direction for capturing the series of images.

In some embodiments, the user guidance module 207 instructs the user interface module 211 to generate a user interface for providing a visually distinct indicator for direction to indicate to the user to move the client device 115 in the specified direction for capturing the subsequent image in the series of images. In some embodiments, the user guidance module 207 receives a determination from the alignment module 205 whether there is an overlap occurring between the previously captured image of the object of interest and the current preview image displayed by the client device 115 based on dynamic feature comparison. The user guidance module 207 determines the direction of movement of the client device 115 based on the overlap occurrence. The user guidance module 207 instructs the user interface module 211 to generate the visually distinct indicator for direction on the user interface in the direction of movement. The visually distinct indicator for direction can be visually distinct by one or more from the group of a shape, a size, a color, a position, an orientation, and shading.

In some embodiments, the user guidance module 207 receives a user selection of a pattern of image capture for capturing the series of images. For example, the selected pattern of image capture may be one from a group of a serpentine scan pattern, a raster scan pattern, and an over-and-back scan pattern. As shown in the example of FIG. 18, the graphical representation illustrates an embodiment of the serpentine scan pattern of image capture. The graphical representation 1800 includes a left-to-right serpentine pattern 1802 and a right-to-left serpentine pattern 1804 for capturing images linearly across an object of interest. The left-to-right serpentine pattern 1802 and the right-to-left serpentine pattern 1804 are shown as starting from the top leftmost position and the top rightmost position, respectively. In other embodiments, the left-to-right serpentine pattern 1802 and the right-to-left serpentine pattern 1804 may start from the bottom leftmost position and the bottom rightmost position, respectively. The serpentine pattern of image capture may take into account the height and width of the object of interest such that the movement of the client device 115 in the serpentine pattern can capture the object of interest completely in the series of images. The client device 115 can be parallel to and facing the object of interest when following the serpentine pattern of image capture. The numerals inside the circles 1806 in the right-to-left serpentine pattern 1804, for example, indicate the sequence to follow for capturing the series of images and the arrows 1808 indicate the direction of movement of the client device 115 in the right-to-left serpentine pattern 1804 for capturing the series of images. In some embodiments, the user guidance module 207 determines a direction of movement for the client device 115 for capturing the series of images based on the user selected pattern of image capture. The user guidance module 207 instructs the user interface module 211 to generate the visually distinct indicator for direction on the user interface based on the capture flow as specified by the user selected pattern of image capture. For example, the user interface module 211 may generate the visually distinct indicator for direction at the center of the user interface to indicate the zigzag movement of capturing images linearly across the object of interest. The visually distinct indicator for direction may freely point to any direction in 360 degrees at the center of the user interface. The visually distinct indicator for direction may be overlaid upon the current preview image of the object of interest on the user interface. An example embodiment is described below in more detail with reference to FIGS. 20A-20I.
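The capture flow of a serpentine pattern can be sketched as an ordered list of grid positions together with the arrow direction shown before each capture. The sketch assumes a column-wise serpentine (down one column, sideways, up the next), which matches the down, sideways, up, sideways, down sequence walked through for FIGS. 20A-20F; the function names and the grid representation are hypothetical.

```python
from typing import List, Tuple

GridPos = Tuple[int, int]  # (column, row); column 0 is leftmost, row 0 is topmost


def serpentine_path(columns: int, rows: int,
                    left_to_right: bool = True,
                    start_at_top: bool = True) -> List[GridPos]:
    """Return grid positions in capture order for a column-wise serpentine scan."""
    col_order = range(columns) if left_to_right else range(columns - 1, -1, -1)
    path: List[GridPos] = []
    going_down = start_at_top
    for col in col_order:
        row_order = range(rows) if going_down else range(rows - 1, -1, -1)
        path.extend((col, row) for row in row_order)
        going_down = not going_down  # reverse vertical direction in the next column
    return path


def arrow_directions(path: List[GridPos]) -> List[str]:
    """Direction the indicator would point before each capture after the first."""
    names = {(0, 1): "down", (0, -1): "up", (1, 0): "right", (-1, 0): "left"}
    return [names[(c2 - c1, r2 - r1)]
            for (c1, r1), (c2, r2) in zip(path, path[1:])]


# Six images over a 3-column by 2-row grid, starting at the top leftmost position.
ltr = serpentine_path(3, 2, left_to_right=True)
ltr_arrows = arrow_directions(ltr)
rtl_arrows = arrow_directions(serpentine_path(3, 2, left_to_right=False))
```

Under these assumptions, `ltr_arrows` is down, right, up, right, down, and the right-to-left variant mirrors it with left in place of right.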

In some embodiments, the user guidance module 207 instructs the user interface module 211 to generate a mosaic preview of the images captured thus far on the user interface for indicating the image capture progress information to the user. For example, the mosaic preview may display an overview of progress of what has been captured so far relating to the object of interest. In some embodiments, the user guidance module 207 instructs the user interface module 211 to highlight a position or location with an outline on the mosaic preview. The outline indicates the location where the next image to be captured of the object of interest may be placed. The outline may be replaced with a thumbnail image representation of the object of interest after the image gets captured by the client device 115. The mosaic preview may be a progressively growing mosaic preview based on the number of captured images. For example, the mosaic preview may include a numbered thumbnail image of each captured image and a numbered outline of an empty thumbnail slot at the location where the next captured image may get placed in the mosaic preview. Each thumbnail image appears on the mosaic preview after the image of the object of interest corresponding to the location of the thumbnail on the mosaic preview is captured. Users can preview the images captured thus far on the mosaic preview and identify whether the images captured are appropriate for a given retail category.

As shown in the example of FIG. 19, the graphical representation 1900 illustrates an embodiment of constructing a mosaic preview using images of a shelving unit. The graphical representation 1900 includes an outline representation 1902 of six individual images (numbered 1-6) captured of a shelving unit 1904. The graphical representation 1900 illustrates that the six images (numbered 1-6) are captured with appropriate overlap (e.g., approximately 60%). The graphical representation 1900 also includes a reconstruction of the captured images in the format of a mosaic preview 1906.
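The effect of the overlap fraction on the number of images can be worked through with simple arithmetic. With overlap fraction o, each new image advances the view by only (1 − o) of a frame, so covering an extent W with frames of extent w takes 1 + ⌈(W − w) / (w(1 − o))⌉ images along that axis. The sketch below applies this to a 16 feet × 8 feet shelving unit; the 4 feet × 6 feet frame footprint and the use of the same overlap in both axes are hypothetical values chosen for illustration, not figures from the specification.

```python
import math


def images_along_axis(extent: float, frame: float, overlap: float) -> int:
    """Images needed to cover `extent` with frames of size `frame`.

    `overlap` is the fraction of each frame shared with its neighbor,
    so consecutive frames advance by frame * (1 - overlap).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if frame >= extent:
        return 1
    step = frame * (1.0 - overlap)
    # Round before ceil so floating-point noise does not add a spurious image.
    return 1 + math.ceil(round((extent - frame) / step, 9))


def images_for_shelf(width: float, height: float,
                     frame_w: float, frame_h: float, overlap: float) -> int:
    """Total images for a grid covering a width x height object."""
    return (images_along_axis(width, frame_w, overlap)
            * images_along_axis(height, frame_h, overlap))


# Hypothetical frame footprint of 4 ft x 6 ft on a 16 ft x 8 ft shelf, 60% overlap.
columns = images_along_axis(16.0, 4.0, 0.6)
rows = images_along_axis(8.0, 6.0, 0.6)
total = images_for_shelf(16.0, 8.0, 4.0, 6.0, 0.6)
```

With these assumed values the grid is 9 columns by 2 rows; a smaller frame footprint or a higher overlap increases the count.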

In some embodiments, the user guidance module 207 may determine a direction of movement of the client device 115 along the object of interest for capturing images under the selected pattern of image capture. For example, the user may initiate the capture session for capturing images of the shelving unit in an aisle from the upper leftmost location (or lower leftmost location) and move the client device 115 to the right linearly along the shelving unit for capturing the rest of the images in the series. In another example, the user may initiate the capture session for capturing images of the shelving unit in an aisle from the upper rightmost location (or lower rightmost location) and then move the client device 115 to the left linearly along the shelving unit for capturing the rest of the images in the series. In the above examples, the selected pattern of image capture by the user may be the serpentine pattern of image capture as described in FIG. 18. In some embodiments, the user guidance module 207 may determine the lateral direction for the serpentine pattern of image capture which the user has selected as the pattern for moving the client device 115 along the object of interest. In some embodiments, the user guidance module 207 may identify whether a subsequent image is captured lateral to a previous image in the sequence of the serpentine pattern of image capture and determine the direction of the serpentine pattern of the image capture. For example, the user may capture a first image of the shelving unit from the top and move the client device 115 to the bottom of the shelving unit to capture a second image of the shelf. At this moment, the user may move the client device 115 laterally either to the left or to the right for capturing a third image in the series. The user guidance module 207 identifies whether the third image is captured laterally to the left or the right of the second image of the shelf and determines the direction of movement of the client device 115 along the shelf in the aisle. For example, if the third image captured was to the left of the second captured image, the user guidance module 207 determines that the direction is a right-to-left serpentine pattern for capturing images linearly across the shelving unit. In another example, if the third image captured was to the right of the second captured image, the user guidance module 207 determines that the direction is a left-to-right serpentine pattern for capturing images. Accordingly, in some embodiments, the user guidance module 207 instructs the user interface module 211 to generate or update the visually distinct indicator for direction on the user interface for capturing subsequent images of the object of interest based on the lateral direction identified for the serpentine pattern of image capture.

In some embodiments, the user guidance module 207 instructs the user interface module 211 to update the mosaic preview of the captured images to indicate the direction of movement of the client device 115 along the object of interest. For example, the mosaic preview may be pushed to the left of the user interface to indicate the client device 115 is following a left-to-right serpentine pattern of image capture. In another example, the mosaic preview may be pushed to the right of the user interface to indicate that the client device 115 is following a right-to-left serpentine pattern of image capture.
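The inference of the lateral direction from the first lateral move, and the resulting placement of the mosaic preview, can be sketched as follows. The horizontal positions are assumed to be available as numbers that increase to the right (for example, estimated from overlap with the previous image); how those positions are obtained, and the function names, are assumptions of the sketch.

```python
from typing import Optional


def infer_serpentine_direction(second_x: float, third_x: float) -> Optional[str]:
    """Infer the pattern direction from where the third image lies relative to the second.

    Positions are horizontal coordinates along the shelf, increasing to the right.
    Returns None when no lateral movement has occurred yet.
    """
    if third_x > second_x:
        return "left-to-right"
    if third_x < second_x:
        return "right-to-left"
    return None


def mosaic_anchor(direction: Optional[str]) -> Optional[str]:
    """Side of the preview region the mosaic is pushed toward.

    A left-to-right pattern pushes the mosaic to the left, leaving room for it
    to grow to the right, and vice versa.
    """
    return {"left-to-right": "left", "right-to-left": "right"}.get(direction)


direction = infer_serpentine_direction(second_x=0.0, third_x=1.5)
anchor = mosaic_anchor(direction)
```

Until a lateral move is detected, both functions return `None`, which a user interface could treat as showing both candidate arrows or outlines.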

As shown in the example of FIGS. 20A-20I, the graphical representations illustrate embodiments of the user interface displaying a visually distinct indicator for direction of movement of the client device 115.

In FIG. 20A, the graphical representation illustrates a user interface 2000 that includes a pair of target outlines 2002 and 2004 of concentric circles overlaid upon a current preview image 2006 of the shelving unit as displayed on the client device 115. The user interface 2000 also includes a region 2008 for displaying a mosaic preview 2010 of captured images below the current preview image 2006. The mosaic preview 2010 may progressively grow based on the captured images of the shelving unit being added to it. The mosaic preview 2010 included within the region 2008 can be pushed either to the right of the region 2008 or to the left of the region 2008 depending on whether a movement of the client device 115 along the shelving unit is from right to left or from left to right. The mosaic preview 2010 (shown empty) in the region 2008 includes an outline labeled ‘1’ of an empty thumbnail image slot which can get replaced with a first image of the shelving unit when the client device 115 captures the first image of the shelving unit. In FIG. 20B, the graphical representation illustrates an updated user interface 2015 that includes an arrow 2017 hanging just outside the inner target outline 2002 to serve as the visually distinct indicator for direction. The arrow 2017 can swivel around the inner target outline 2002 and point in any direction in 360 degrees. The arrow 2017 can be customized to be of any color, shape, shading, size, symbol, etc. The user interface 2015 also includes a ball 2019 that serves as the visually distinct indicator for overlap. The arrow 2017 is pointing down on the user interface 2015 to indicate to the user to move the client device 115 down for capturing a next image of the shelving unit. The mosaic preview 2010 included within the region 2008 now includes an outline 2021 labeled ‘2’ for a second image to fit into the mosaic preview 2010 at a location as shown by the outline 2021. The outline labeled ‘1’ of the mosaic preview 2010 from FIG. 20A is no longer visible in FIG. 20B because a thumbnail representation of the first image of the shelving unit has replaced the outline labeled ‘1’ in the mosaic preview 2010. In association with the arrow 2017, the location of the outline 2021 on the mosaic preview 2010 also serves to visually indicate where along the shelf to move the client device 115 to capture the second image. The second image can be captured by moving the client device 115 in the downward direction to produce a decent overlap with the first image. The arrow 2017 disappears when the ball 2019 passes through the outer target outline 2004 as it is no longer needed to indicate the direction. When the ball 2019 is aligned and positioned within the inner target outline 2002, the second image may be captured. A thumbnail of the captured second image may replace the outline 2021 labeled ‘2’ in the mosaic preview 2010 included within the region 2008. The undo button 2023 when pressed by the user may allow the user to back up the shelf and retake the second image if needed. In FIG. 20C, the graphical representation illustrates an updated user interface 2030 that includes two arrows as an example: a right arrow 2032 and a left arrow 2034 to demonstrate two possible paths the user can take to capture a next image in the series. In some embodiments, the user interface 2030 may display either the right arrow 2032 or the left arrow 2034 at a time. The mosaic preview 2010 included within the region 2008 in the user interface 2030 now includes two outlines: a left outline 2036 labeled ‘3 a’ and a right outline 2038 labeled ‘3 b’ to indicate to the user to capture a next image that may be either to the left or right of the previous image. Assuming the user is going to move to the right, the user interface 2030 may be updated to display the right arrow 2032 and the ball 2019 on the right of the user interface as the user begins to move the client device 115 to the right. When the ball 2019 is aligned and positioned within the inner target outline 2002, the third image may be captured. A thumbnail of the captured third image may replace the outline 2038 labeled ‘3 b’ in the mosaic preview 2010 included within the region 2008. In some embodiments, the user guidance module 207 determines the direction of movement of the client device 115 along the shelf in the aisle responsive to the third image being captured laterally to the second image. For example, if the third image captured was to the right of the second captured image, the user guidance module 207 determines that the direction of movement of the client device 115 is a left-to-right serpentine pattern for capturing images. In some embodiments, the user guidance module 207 may instruct the user interface module 211 to present the user with a user interface and request the user to indicate from which side of the object of interest (e.g., aisle) to start the image capture process. The user guidance module 207 determines the direction of movement of the client device 115 based on the user input.

In FIG. 20D, the graphical representation illustrates an updated user interface 2045 that includes an up arrow 2047 to indicate to the user to move the client device 115 upward to capture the next image of the shelving unit. The user interface 2045 includes a ball 2019 appearing at the top. The mosaic preview 2010 included within the region 2008 now includes an outline 2049 labeled ‘4’ for a fourth image to fit into the mosaic preview 2010 as shown by the location of the outline 2049. The fourth image can be captured by moving the client device 115 in the upward direction until the ball 2019 is aligned and positioned within the inner target outline 2002. The mosaic preview 2010 in the region 2008 is pushed all the way to the left of region 2008 to visually indicate that the direction of movement of the client device 115 is from the left to the right of the shelving unit. In FIG. 20E, the graphical representation illustrates an updated user interface 2060 that includes a right arrow 2062 to indicate to the user to move the client device 115 to the right to capture the fifth image of the shelving unit. Similarly, in FIG. 20F, the graphical representation illustrates an updated user interface 2075 that includes a down arrow 2077 to indicate to the user to move the client device 115 down to capture the sixth image of the shelving unit. In FIGS. 20G-20I, the graphical representations illustrate embodiments of alternate user interfaces based on the user choosing to move the client device 115 to the left in FIG. 20C. In FIGS. 20G-20I, the mosaic preview 2010 in the region 2008 is pushed all the way to the right of region 2008 to visually indicate that the direction of movement of the client device 115 is from the right to the left of the shelving unit.

As shown in the example of FIG. 21, the graphical representation illustrates another embodiment of the user interface 2100 displaying a visually distinct indicator for direction of movement of the capture device. In FIG. 21, the graphical representation illustrates a user interface 2100 that includes an arrow 2104 outside the perimeter of an outer target outline 2102 to serve as the visually distinct indicator for direction. The arrow 2104 can swivel around the outer target outline 2102 and point in any direction in 360 degrees.

In some embodiments, the stitching module 209 receives the images from the feature extraction module 203 and sends the set of captured images along with the overlap information from the client device 115 to the recognition server 101 for stitching a single linear panoramic image. In some embodiments, the stitching module 209 compares the extracted features of each individual image in the set of captured images to those features stored in the data storage 243 for recognition. The stitching module 209 identifies, for example, the products in the individual images and uses such information in combination with the overlap information for stitching the set of captured images together into a single linear panoramic image. As shown in the example of FIGS. 22A-22B, the graphical representations illustrate embodiments of the user interface for previewing the set of captured images in a mosaic. In FIG. 22A, the graphical representation illustrates a user interface 2200 displaying a mosaic 2201 previewing the set of all images of the shelf that have been captured so far and stitched together in a single panoramic image using the overlap information and image features obtained when the images were captured. For example, the overlap of the images shown in the user interface 2200 may be approximately the same as the overlap threshold parameter of 60 percent. The user interface 2200 also includes a tab 2203 which the user can slide to view a highlighting of thumbnail images of each one of the individually captured images. In FIG. 22B, the graphical representation illustrates a user interface 2250 highlighting thumbnail images of each one of the individually captured images in response to the user sliding the tab 2203. For example, the user may tap the highlighted image 2205 to view the image in a larger preview user interface. In some embodiments, the stitching module 209 determines relevant analytical data including information about the state of the shelf from the linear panoramic image. For example, the stitching module 209 may identify out of stock products, unknown products, etc. from the linear panoramic image. In another example, the stitching module 209 may determine planogram compliance using the linear panoramic image. The stitching module 209 may store the panoramic image and associated metadata in the data storage 243. The stitching module 209 may also instruct the user interface module 211 to provide instructions on the display of the client device 115 requesting the user to take corrective actions in-store. For example, the corrective action may be to arrange the products on the shelf in compliance with the planogram.

A system and method for capturing a series of images to create a linear panorama has been described. In the above description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the techniques introduced above. It will be apparent, however, to one skilled in the art that the techniques can be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order to avoid obscuring the description and for ease of understanding. For example, the techniques are described in one embodiment above primarily with reference to software and particular hardware. However, the present invention applies to any type of computing system that can receive data and commands, and present information as part of any peripheral devices providing services.

Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.

Some portions of the detailed descriptions described above are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are, in some circumstances, used by those skilled in the data processing arts to convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing”, “computing”, “calculating”, “determining”, “displaying”, or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

The techniques also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a non-transitory computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, flash memories including USB keys with non-volatile memory or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.

Some embodiments can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. One embodiment is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.

Furthermore, some embodiments can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

A data processing system suitable for storing and/or executing program code can include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.

Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.

Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.

Finally, the algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description above. In addition, the techniques are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the various embodiments as described herein.

The foregoing description of the embodiments has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the specification to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the embodiments be limited not by this detailed description, but rather by the claims of this application. As will be understood by those familiar with the art, the examples may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Likewise, the particular naming and division of the modules, routines, features, attributes, methodologies and other aspects are not mandatory or significant, and the mechanisms that implement the description or its features may have different names, divisions and/or formats. Furthermore, as will be apparent to one of ordinary skill in the relevant art, the modules, routines, features, attributes, methodologies and other aspects of the specification can be implemented as software, hardware, firmware or any combination of the three. Also, wherever a component, an example of which is a module, of the specification is implemented as software, the component can be implemented as a standalone program, as part of a larger program, as a plurality of separate programs, as a statically or dynamically linked library, as a kernel loadable module, as a device driver, and/or in every and any other way known now or in the future to those of ordinary skill in the art of computer programming. Additionally, the specification is in no way limited to embodiment in any specific programming language, or for any specific operating system or environment. Accordingly, the disclosure is intended to be illustrative, but not limiting, of the scope of the specification, which is set forth in the following claims.

What is claimed is:
 1. A method comprising: receiving an image of a portion of an object of interest from a client device for serving as a reference image; receiving a preview image of another portion of the object of interest; generating a user interface including the preview image; adding to the user interface a progressively growing mosaic preview, the mosaic preview including a thumbnail representation of the reference image; comparing dynamically the reference image with the preview image to determine whether an overlap between the reference image and the preview image satisfies a predetermined overlap threshold; and responsive to the overlap between the reference image and the preview image satisfying the predetermined overlap threshold, setting the preview image as the reference image.
 2. The method of claim 1, further comprising: receiving a selection of a pattern of image capture from the client device; determining whether the reference image of the object of interest is identified lateral to a previous reference image; identifying a lateral direction of the pattern of image capture responsive to the reference image being identified lateral to the previous reference image; and adding to the user interface a visually distinct indicator overlaid upon the preview image, the visually distinct indicator identifying a direction for guiding a movement of the client device based on the identified lateral direction of the pattern of image capture.
 3. The method of claim 2, comprising receiving a next preview image of a different portion of the object of interest in the direction identified by the visually distinct indicator responsive to the overlap between the reference image and the preview image failing to satisfy the predetermined overlap threshold.
 4. The method of claim 1, wherein the mosaic preview represents a thumbnail reconstruction of one or more reference images received for the object of interest and identifies an outline of a location for a subsequent reference image of the object of interest for placement in the mosaic preview.
 5. The method of claim 2, comprising sliding the mosaic preview on the user interface in an opposite direction of the identified lateral direction of the pattern of image capture.
 6. The method of claim 2, wherein the pattern of image capture is one from a group of a serpentine scan pattern of image capture, a raster scan pattern of image capture and an over-and-back scan pattern of image capture.
 7. The method of claim 2, wherein the direction for guiding the movement of the client device is one from a group of north, south, east and west.
 8. The method of claim 1, comprising sending the reference images received for the object of interest for generating a single linear panoramic image.
 9. A system comprising: one or more processors; and a memory, the memory storing instructions, which when executed cause the one or more processors to: receive an image of a portion of an object of interest from a client device for serving as a reference image; receive a preview image of another portion of the object of interest; generate a user interface including the preview image; add to the user interface a progressively growing mosaic preview, the mosaic preview including a thumbnail representation of the reference image; compare dynamically the reference image with the preview image to determine whether an overlap between the reference image and the preview image satisfies a predetermined overlap threshold; and responsive to the overlap between the reference image and the preview image satisfying the predetermined overlap threshold, set the preview image as the reference image.
 10. The system of claim 9, wherein the instructions further cause the one or more processors to: receive a selection of a pattern of image capture from the client device; determine whether the reference image of the object of interest is identified lateral to a previous reference image; identify a lateral direction of the pattern of image capture responsive to the reference image being identified lateral to the previous reference image; and add to the user interface a visually distinct indicator overlaid upon the preview image, the visually distinct indicator identifying a direction for guiding a movement of the client device based on the identified lateral direction of the pattern of image capture.
 11. The system of claim 10, wherein the instructions further cause the one or more processors to receive a next preview image of a different portion of the object of interest in the direction identified by the visually distinct indicator responsive to the overlap between the reference image and the preview image failing to satisfy the predetermined overlap threshold.
 12. The system of claim 9, wherein the mosaic preview represents a thumbnail reconstruction of one or more reference images received for the object of interest and identifies an outline of a location for a subsequent reference image of the object of interest for placement in the mosaic preview.
 13. The system of claim 10, wherein the instructions further cause the one or more processors to slide the mosaic preview on the user interface in an opposite direction of the identified lateral direction of the pattern of image capture.
 14. The system of claim 10, wherein the pattern of image capture is one from a group of a serpentine scan pattern of image capture, a raster scan pattern of image capture and an over-and-back scan pattern of image capture.
 15. The system of claim 9, wherein the instructions further cause the one or more processors to send the reference images received for the object of interest for generating a single linear panoramic image.
 16. A computer program product comprising a non-transitory computer readable medium storing a computer readable program, wherein the computer readable program when executed on a computer causes the computer to: receive an image of a portion of an object of interest from a client device for serving as a reference image; receive a preview image of another portion of the object of interest; generate a user interface including the preview image; add to the user interface a progressively growing mosaic preview, the mosaic preview including a thumbnail representation of the reference image; compare dynamically the reference image with the preview image to determine whether an overlap between the reference image and the preview image satisfies a predetermined overlap threshold; and responsive to the overlap between the reference image and the preview image satisfying the predetermined overlap threshold, set the preview image as the reference image.
 17. The computer program product of claim 16, wherein the computer readable program further causes the computer to receive a selection of a pattern of image capture from the client device, to determine whether the reference image of the object of interest is identified lateral to a previous reference image, to identify a lateral direction of the pattern of image capture responsive to the reference image being identified lateral to the previous reference image and to add to the user interface a visually distinct indicator overlaid upon the preview image, the visually distinct indicator identifying a direction for guiding a movement of the client device based on the identified lateral direction of the pattern of image capture.
 18. The computer program product of claim 17, wherein the computer readable program further causes the computer to receive a next preview image of a different portion of the object of interest in the direction identified by the visually distinct indicator responsive to the overlap between the reference image and the preview image failing to satisfy the predetermined overlap threshold.
 19. The computer program product of claim 17, wherein the computer readable program further causes the computer to slide the mosaic preview on the user interface in an opposite direction of the identified lateral direction of the pattern of image capture.
 20. The computer program product of claim 16, wherein the computer readable program further causes the computer to send the reference images received for the object of interest for generating a single linear panoramic image. 
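
By way of illustration only, and without limiting the claims, the dynamic comparison of a reference image with successive preview images may be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the specification: frames are of equal width, the capture device moves only horizontally, each frame is represented by the horizontal position of its left edge, and the overlap threshold is treated as satisfied when the overlap is within a small tolerance of the threshold. The 60 percent default follows the example overlap threshold parameter given in the description. The function names and the tolerance parameter are hypothetical.

```python
def overlap_fraction(ref_x, preview_x, frame_width):
    """Fraction of the frame width shared by two equal-width frames.

    ref_x and preview_x are the horizontal positions of the left edges of
    the reference frame and the preview frame (a simplifying assumption).
    """
    shift = abs(preview_x - ref_x)
    return max(0.0, 1.0 - shift / frame_width)


def capture_sequence(first_x, preview_xs, frame_width,
                     threshold=0.6, tolerance=0.05):
    """Return the positions of the frames chosen as reference images.

    The first captured frame serves as the initial reference. Each preview
    frame is compared with the current reference. When the overlap is within
    `tolerance` of `threshold` (the assumed meaning of "satisfies" here),
    the preview frame becomes the new reference.
    """
    references = [first_x]
    for x in preview_xs:
        overlap = overlap_fraction(references[-1], x, frame_width)
        if abs(overlap - threshold) <= tolerance:
            references.append(x)  # the preview image is set as the reference
    return references


if __name__ == "__main__":
    # Simulated left-edge positions of preview frames as the device moves right.
    previews = [100, 250, 400, 550, 700, 800, 1000]
    print(capture_sequence(0, previews, frame_width=1000))
```

In this simulation the frames at positions 400 and 800 each overlap the then-current reference by 60 percent, so each is set as the next reference image in turn.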